Azure: importing packages not already available in the environment from 'src'

I have an experiment in which an R script module uses functions defined in a zip source (Data Exploration). This is how you handle packages that do not already exist in the Azure environment.
The DataExploration module was imported from a file, Azure.zip, containing all the packages and functions I need (as shown in the picture).
When I run the experiment, nothing goes wrong. On the contrary, the log makes it clear that Azure is able to handle the source.
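For context, an Execute R Script module typically consumes such a bundle along these lines (a minimal sketch, not my exact script; the helper file name is hypothetical):
# The zip connected to the script-bundle input port is unpacked under "src/",
# so bundled binary packages can be installed and loaded from there.
install.packages("src/scales_0.4.0.zip", lib = ".", repos = NULL, verbose = TRUE)
library(scales, lib.loc = ".")
# Helper functions shipped in the same bundle (hypothetical file name):
source("src/data_exploration_helpers.R")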
The problem is that when I deploy the web service (classic) and run it, I get the following error:
FailedToEvaluateRScript: The following error occurred during
evaluation of R script: R_tryEval: return error: Error in
.zip.unpack(pkg, tmpDir) : zip file 'src/scales_0.4.0.zip' not found ,
Error code: LibraryExecutionError, Http status code: 400, Timestamp:
Thu, 21 Jul 2016 09:05:25 GMT
It's as if it cannot see scales_0.4.0.zip in the 'src' folder.
The strange thing is that everything worked until a few days ago. Then I copied the experiment to a second workspace, and it gives me the above error.
I have also tried re-uploading the DataExploration module to the new workspace, but the result is the same.

I have "solved" this thanks to the help of AzureML support: it is a bug they are trying to fix right now.
The bug shows up when you have multiple R script modules and the first one has no zip input connected while the following ones do.
Workaround: connect the zip input module to the first R script module too.

Related

Airflow can't reach logs from webserver due to 403 error

I use Apache Airflow for daily ETL jobs. I installed it in Azure Kubernetes Service using the provided Helm chart. It's been running fine for half a year, but recently I've become unable to access the logs in the webserver (this always used to work).
I'm getting the following error:
*** Log file does not exist: /opt/airflow/logs/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log
*** Fetching from: http://airflow-worker-0.airflow-worker.default.svc.cluster.local:8793/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log
*** !!!! Please make sure that all your Airflow components (e.g. schedulers, webservers and workers) have the same 'secret_key' configured in 'webserver' section and time is synchronized on all your machines (for example with ntpd) !!!!!
****** See more at https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#secret-key
****** Failed to fetch log file from worker. Client error '403 FORBIDDEN' for url 'http://airflow-worker-0.airflow-worker.default.svc.cluster.local:8793/dag_id=analytics_etl/run_id=manual__2022-09-26T09:25:50.010763+00:00/task_id=copy_device_table/attempt=18.log'
For more information check: https://httpstatuses.com/403
What have I tried:
I've made sure that the log file exists (I can exec into the airflow-worker-0 pod and read the file on command line in the location specified in the error).
I've rolled back my deployment to an earlier commit from when I know for sure it was still working, but it made no difference.
I was using webserverSecretKeySecretName in the values.yaml configuration. I changed the secret to which that name was pointing (deleted it and created a new one, as described here: https://airflow.apache.org/docs/helm-chart/stable/production-guide.html#webserver-secret-key) but it didn't work (no difference, same error).
I changed the config to use a webserverSecretKey instead (in plain text), no difference.
My thoughts/observations:
The error states that the log file doesn't exist, but that's not true. It probably just can't access it.
The time is the same in all pods (I double-checked by exec-ing into them and typing date on the command line).
The webserver secret is the same in the worker, the scheduler, and the webserver (I double-checked by exec-ing into them and checking the corresponding env variable).
Any ideas?
Turns out this was a known bug with the latest release (2.4.0) of the official Airflow Helm chart, reported here:
https://github.com/apache/airflow/discussions/26490
Should be resolved in version 2.4.1 which should be available in the next couple of days.

Unable to deploy/update google cloud function

I have a Firebase project with 29 functions: 2 in Python and 27 in Node.js.
I modified 2 of them and now I can't deploy properly. I get an error log that sends me to the log viewer, and one of the errors is:
ERROR: build step 3
"us.gcr.io/fn-img/buildpacks/nodejs10/builder:nodejs10_20201201_20_RC00"
failed: step exited with non-zero status: 46
The functions keep on working, but I can't update/deploy properly. When I try to deploy them individually, I get that error for both functions; when I try to deploy ALL the functions, I only get the error for those 2 functions, and the rest of the functions, which have no modifications, redeploy without any problem.
I checked the source code in the Cloud console and both have a warning icon saying:
Function is active, but last deployment failed
The source code in the Cloud console is the same as the one I'm trying to deploy. The functions still have the same functionality as before I made the changes: they keep working, but they can't be updated.
These are JavaScript functions that I deployed using the Firebase Node SDK.
Any help?
EDIT I:
I reverted the changes on one of the functions, which has been there for over 2 years, and I still have the same issue: I can't update/deploy it. That function triggers on storage.onFinalize().
The other function triggers on firestore.onCreate().
EDIT II:
The newest function I created is not in use; it is part of a new feature in my Android application. So I duplicated it, gave it a different name, and deployed it without issues. In that case I could delete the original function without any problem, as it is not being used. But I can't do the same for the other function, which is constantly in use.

"Error: Key not loaded" in h2o deployed through a K3s cluster, using python3 client

I can confirm the 3-replica h2o cluster inside K3s is correctly deployed, as running h2o.init(ip="x.x.x.x") in the Python 3 interpreter works as expected. I followed the instructions noted here: https://www.h2o.ai/blog/running-h2o-cluster-on-a-kubernetes-cluster/
Nevertheless, I had to modify the service.yaml and comment out the line that says clusterIP: None, as K3s was complaining about being unable to set the clusterIP to None. Even so, I can certify it is working correctly, and I am able to use an external IP to connect to the cluster.
If I try to load the dataset on the h2o cluster inside the K3s cluster, following the exact same steps described here http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html, this is the output that I get:
>>> train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
...
h2o.exceptions.H2OResponseError: Server error java.lang.IllegalArgumentException:
Error: Key not loaded: Key<Frame> https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv
Request: POST /3/ParseSetup
data: {'check_header': '0', 'source_frames': '["https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv"]'}
The same error occurs if I use the h2o.upload_file("x.csv") method.
There is a clue about what may be happening here: "Key not loaded: Key<Frame> while POSTing source frame through ParseSetup in H2O API call", but I am not using curl, and I cannot find any parameter that could help me overcome this issue: http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/h2o.html?highlight=import_file#h2o.import_file
I need to use the Python client inside the same K3s cluster for various technical reasons, so I am not able to launch either Flow or Firebug to see what may be happening.
I can confirm it is working correctly when I simply issue a h2o.init(), using the local Java instance.
UPDATE 1:
I have tried in different K3s clusters without success. I changed the service.yaml to a NodePort, and now this is the error traceback:
>>> train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
...
h2o.exceptions.H2OResponseError: Server error java.lang.IllegalArgumentException:
Error: Job is missing
Request: GET /3/Jobs/$03010a2a016132d4ffffffff$_a2366be93ec99a78d7bc161de8c54d67
UPDATE 2:
I have tried using different services (NodePort, LoadBalancer, ClusterIP) and none of them work. I have also tried using Minikube with the official image, and with a custom image I built, without success. I suspect this is related either to h2o itself or to the clustering between pods. I will keep digging, and hopefully there will be some gold in it.
UPDATE 3:
I also found out that the post about running H2O in Docker (https://www.h2o.ai/blog/h2o-docker/) is really outdated, and the Dockerfile on GitHub is not working either (I changed it to uncomment the ENTRYPOINT section, without success): https://github.com/h2oai/h2o-3/blob/master/Dockerfile
Even so, I tried the custom image I built for h2o-k8s and it works seamlessly in pure Docker. I am wondering why it is still not working in K8s...
UPDATE 4:
I have tried modifying the environment variable called H2O_KUBERNETES_SERVICE_DNS without success.
In the meantime, the cluster started to become unavailable, that is, the readinessProbes would not complete successfully. No matter what I change now, it does not work.
I spun up a local K3d cluster to see what happened, and surprisingly the readinessProbes were not failing, using v3.30.0.6. I then started testing it with R instead of Python. I am glad I tried, because I may have pinpointed what was wrong: there is a version mismatch between the client and the server. So I updated the image accordingly to v3.30.0.1.
But now, again, the readinessProbe is not working in my K3d cluster, so I am unable to test it.
It seems it is working now. R client version 3.30.0.1 with server version 3.30.0.1. I also tried Python client version 3.30.0.7 with server version 3.30.0.7, and it started working. Marvelous. The problem was caused by a version mismatch between the client and the server, as the Python client had been updated to 3.30.0.7 while the latest server image for Docker was 3.30.0.6.
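To catch this kind of mismatch at connect time rather than as confusing downstream errors, it helps to be explicit about the version check when attaching a client to the remote cluster. A minimal sketch with the R client, assuming the default port and a placeholder IP:
# Attach to the existing remote cluster (don't start a local JVM) and keep the
# strict version check enabled so a client/server mismatch is reported immediately.
library(h2o)
h2o.init(ip = "x.x.x.x", port = 54321, startH2O = FALSE, strict_version_check = TRUE)
train <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
strict_version_check is already the default in recent clients, but keeping it explicit documents the assumption that client and server versions must match.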

kentico 9 cmsdesk system error after admin login

This is a fun one. I just moved my build to the dev environment, and I'm getting a system error when trying to access the CMS Desk. I can't see the event log to troubleshoot. How can I go about finding possible issues? I have admin access to the DEV VM, but not the SQL box.
I'm currently on hotfix 30, and my local dev is fine.
I've noticed that the Kentico CMS Health Monitor and Scheduler services didn't start. Manually starting them gives an error.
When I attempt to log in from the VM, I get this error.
Server Error in '/' Application.
Compilation Error
Description: An error occurred during the compilation of a resource required to service this request. Please review the following specific error details and modify your source code appropriately.
Compiler Error Message: CS0433: The type 'CMSAdminControls_Basic_OrderByControl' exists in both 'c:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\a8d48e58\742913f6\assembly\dl3\864636c6\8ed0e2be_5525d201\CMSApp.DLL' and 'c:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root\a8d48e58\742913f6\App_Web_orderbycontrol.ascx.cffe6b5c.9lvruqx-.dll'
Source Error: lines 87-91 (markup not shown)
Source File: c:\inetpub\wwwroot\Kentico9\CMS\CMSAdminControls\UI\UniGrid\Controls\AdvancedExport.ascx Line: 89
Try cleaning out (deleting) the content of that temporary folder.
It looks like Kentico was set up as a Web Site rather than a Web Application. Just as easy to nuke it and reinstall.

Server Error in '/DotNetNuke_Community' Application

I'm getting the following error when attempting to run DotNetNuke 7.1 from IIS.
Object reference not set to an instance of an object.
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.NullReferenceException: Object reference not set to an instance of an object.
Source Error:
Line 572: //first call GetProviderPath - this insures that the Database is Initialised correctly
Line 573: //and also generates the appropriate error message if it cannot be initialised correctly
Line 574: string strMessage = DataProvider.Instance().GetProviderPath();
Line 575: //get current database version from DB
Line 576: if (!strMessage.StartsWith("ERROR:"))
I've tried running it from Visual Studio 2012 after downloading the source code and extracting it to a folder, but I get the same error (also, VS loads about 13 instances of its built-in web server, which can't be correct).
Clearly, there is something wrong with the database. From what I've read in the past, there should have been a start-up configuration page (for configuring settings the first time you run the project).
I did look at the local version of IIS (running on Windows 8) and it created the site fine there; however, for some reason the internal web server attempts to run (and the option to run on an external IIS is greyed out).
Anyone run into this problem with DNN Community edition? I've tried running as admin and setting permissions with no luck at all.
Any way to fix this?
Ok, the key is to delete the Database.mdf file completely.
Then create a new empty database of your choice in SQL Server (2008 or greater).
Create a new user account with db_owner access (as it must be able to create tables, etc).
Change the connection strings in the release.config and development.config to connect to the database.
DELETE the web.config file.
RENAME either config file to "web.config".
Set the default project to the web project in VS.
Set the default page to default.aspx.
Run.
I made the erroneous assumption that running the app would rename the config file for me (not sure why I assumed that).
SOLVED!
