Seldon Core loading sklearn/iris failed - seldon

I tried to load the iris model using Seldon Core, and unfortunately the following error occurred: SKLEARN_SERVER fails to load Seldon's sklearn/iris model with the error below.
starting microservice
2021-09-02 02:43:19,363 - seldon_core.microservice:main:206 - INFO: Starting microservice.py:main
2021-09-02 02:43:19,363 - seldon_core.microservice:main:207 - INFO: Seldon Core version: 1.10.0
2021-09-02 02:43:19,463 - seldon_core.microservice:main:362 - INFO: Parse JAEGER_EXTRA_TAGS []
2021-09-02 02:43:19,463 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation kubernetes.io/config.seen:2021-09-02T02:41:35.820784600Z
2021-09-02 02:43:19,463 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation kubernetes.io/config.source:api
2021-09-02 02:43:19,463 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation prometheus.io/path:/stats/prometheus
2021-09-02 02:43:19,463 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation prometheus.io/port:15020
2021-09-02 02:43:19,463 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation prometheus.io/scrape:true
2021-09-02 02:43:19,464 - seldon_core.microservice:load_annotations:158 - INFO: Found annotation sidecar.istio.io/status:{\"initContainers\":[\"istio-init\"],\"containers\":[\"istio-proxy\"],\"volumes\":[\"istio-envoy\",\"istio-data\",\"istio-podinfo\",\"istio-token\",\"istiod-ca-cert\"],\"imagePullSecrets\":null}
2021-09-02 02:43:19,559 - seldon_core.microservice:main:365 - INFO: Annotations: {'kubernetes.io/config.seen': '2021-09-02T02:41:35.820784600Z', 'kubernetes.io/config.source': 'api', 'prometheus.io/path': '/stats/prometheus', 'prometheus.io/port': '15020', 'prometheus.io/scrape': 'true', 'sidecar.istio.io/status': '{\\"initContainers\\":[\\"istio-init\\"],\\"containers\\":[\\"istio-proxy\\"],\\"volumes\\":[\\"istio-envoy\\",\\"istio-data\\",\\"istio-podinfo\\",\\"istio-token\\",\\"istiod-ca-cert\\"],\\"imagePullSecrets\\":null}'}
2021-09-02 02:43:19,559 - seldon_core.microservice:main:369 - INFO: Importing SKLearnServer
2021-09-02 02:43:20,562 - SKLearnServer:__init__:21 - INFO: Model uri: /mnt/models
2021-09-02 02:43:20,563 - SKLearnServer:__init__:22 - INFO: method: predict_proba
2021-09-02 02:43:20,564 - SKLearnServer:load:26 - INFO: load
2021-09-02 02:43:20,565 - root:download:31 - INFO: Copying contents of /mnt/models to local
2021-09-02 02:43:20,659 - SKLearnServer:load:30 - INFO: model file: /mnt/models/model.joblib
Traceback (most recent call last):
  File "/opt/conda/bin/seldon-core-microservice", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.7/site-packages/seldon_core/microservice.py", line 379, in main
    user_object = user_class(**parameters)
  File "/microservice/SKLearnServer.py", line 23, in __init__
    self.load()
  File "/microservice/SKLearnServer.py", line 31, in load
    self._joblib = joblib.load(model_file)
  File "/opt/conda/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/opt/conda/lib/python3.7/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "/opt/conda/lib/python3.7/pickle.py", line 1088, in load
    dispatch[key[0]](self)
  File "/opt/conda/lib/python3.7/pickle.py", line 1376, in load_global
    klass = self.find_class(module, name)
  File "/opt/conda/lib/python3.7/pickle.py", line 1426, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.linear_model.logistic'
It looks like a version issue with the sklearn package in Seldon's sklearn inference server. This is my SeldonDeployment file:
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: "sklearn"
spec:
  name: "sklearn"
  predictors:
    - componentSpecs:
        - spec:
            containers:
              - name: classifier
                env:
                  - name: GUNICORN_THREADS
                    value: "10"
                  - name: GUNICORN_WORKERS
                    value: "1"
                resources:
                  requests:
                    cpu: 5m
                    memory: 10Mi
                  limits:
                    cpu: 50m
                    memory: 100Mi
      graph:
        children: []
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/sklearn/iris
        name: classifier
      name: default
      replicas: 2
This is my sklearn Inference server configuration:
"SKLEARN_SERVER":{
"protocols":{
"kfserving":{
"defaultImageVersion":"0.3.2",
"image":"seldonio/mlserver"
},
"seldon":{
"defaultImageVersion":"1.10.0",
"image":"seldonio/sklearnserver"
}
}
}
Is there something wrong with my setup?

This is because the Seldon Core version does not match the version of the model. Note that the example models for Seldon Core version 1.10.0 are under gs://seldon-models/v1.11.0-dev.
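Assuming the standard layout of that bucket, pointing `modelUri` at the versioned path should resolve the mismatch, for example:

```
graph:
  children: []
  implementation: SKLEARN_SERVER
  modelUri: gs://seldon-models/v1.11.0-dev/sklearn/iris
  name: classifier
```

Alternatively, re-serialize your own model with the same scikit-learn version that the sklearnserver image ships with, so joblib can resolve the pickled module paths.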

Related

Files in AIRFLOW_HOME (which is an Azure File Share MOUNT) are created as root

I have set up Airflow in Azure Cloud (Azure Container Apps) and attached an Azure File Share as an external mount/volume.
1. I ran the **airflow init** service; it created the `airflow.cfg` and `webserver_config.py` files in **AIRFLOW_HOME (/opt/airflow)**, which is actually an Azure mounted file system.
2. I ran the **airflow webserver** service; it created the `airflow-webserver.pid` file in the same **AIRFLOW_HOME (/opt/airflow)**.
Now the problem is that all the files above are created with the root user and group, not as the airflow user (50000), even though I also set the env variable AIRFLOW_UID to 50000 when creating the container app. Because of this my webserver is not starting, throwing the error below:
PermissionError: [Errno 1] Operation not permitted: '/opt/airflow/airflow-webserver.pid'
Note: Azure Container Apps does not allow root/sudo commands, otherwise I could solve this problem with a simple chown.
Another problem is that the Airflow configurations passed through environment variables are never picked up by Docker, e.g.:
- name: AIRFLOW__API__AUTH_BACKENDS
  value: 'airflow.api.auth.backend.basic_auth'
Your help is much appreciated!
YAML file that I use to create my container app:
id: /subscriptions/1234/resourceGroups/<my-res-group>/providers/Microsoft.App/containerApps/<app-name>
identity:
  type: None
location: eastus2
name: webservice
properties:
  configuration:
    activeRevisionsMode: Single
    registries: []
  managedEnvironmentId: /subscriptions/1234/resourceGroups/<my-res-group>/providers/Microsoft.App/managedEnvironments/container-app-env
  template:
    containers:
      - command:
          - /bin/bash
          - -c
          - exec /entrypoint airflow webserver
        env:
          - name: AIRFLOW__API__AUTH_BACKENDS
            value: 'airflow.api.auth.backend.basic_auth'
          - name: AIRFLOW__CELERY__BROKER_URL
            value: redis://:#myredis.redis.cache.windows.net:6379/0
          - name: AIRFLOW__CELERY__RESULT_BACKEND
            value: db+postgresql://user:pass#postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION
            value: 'true'
          - name: AIRFLOW__CORE__EXECUTOR
            value: CeleryExecutor
          - name: AIRFLOW__CORE__SQL_ALCHEMY_CONN
            value: postgresql+psycopg2://user:pass#postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__DATABASE__SQL_ALCHEMY_CONN
            value: postgresql+psycopg2://user:pass#postres-db-servconn.postgres.database.azure.com/airflow?sslmode=require
          - name: AIRFLOW__CORE__LOAD_EXAMPLES
            value: 'false'
          - name: AIRFLOW_UID
            value: 50000
        image: docker.io/apache/airflow:latest
        name: wsr
        volumeMounts:
          - volumeName: randaf-azure-files-volume
            mountPath: /opt/airflow
        probes: []
        resources:
          cpu: 0.25
          memory: 0.5Gi
    scale:
      maxReplicas: 3
      minReplicas: 1
    volumes:
      - name: randaf-azure-files-volume
        storageName: randafstorage
        storageType: AzureFile
resourceGroup: RAND
tags:
  tagname: ws-only
type: Microsoft.App/containerApps
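One detail that may explain the ownership problem (an observation about the official image setup, not a verified Container Apps fix): in Airflow's reference docker-compose.yaml, AIRFLOW_UID is not consumed inside the container at all; Compose interpolates it to decide which user the container runs as, e.g.:

```
# Excerpt in the spirit of Airflow's reference docker-compose.yaml:
# the variable takes effect through `user:`, not as an in-container env var.
user: "${AIRFLOW_UID:-50000}:0"
```

So setting AIRFLOW_UID only as an environment variable does not by itself change which user creates files on the mount.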

Reading the medusa_s3_credentials does not work on medusa

Describe the bug
Python raises an error during the initialization of the medusa container.
Environment:
```
apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-key
type: Opaque
stringData:
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = xxxxxx
    aws_secret_access_key = xxxxxxxx
```
medusa-operator version:
0.12.2
Helm charts version info
apiVersion: v2
name: k8ssandra
type: application
version: 1.6.0-SNAPSHOT
dependencies:
  - name: cass-operator
    version: 0.35.2
  - name: reaper-operator
    version: 0.32.3
  - name: medusa-operator
    version: 0.32.0
  - name: k8ssandra-common
    version: 0.28.4
Kubernetes version information:
v1.23.1
Kubernetes cluster kind:
EKS
Operator logs:
MEDUSA_MODE = GRPC
sleeping for 0 sec
Starting Medusa gRPC service
INFO:root:Init service
[2022-05-10 12:56:28,368] INFO: Init service
DEBUG:root:Loading storage_provider: s3
[2022-05-10 12:56:28,368] DEBUG: Loading storage_provider: s3
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): 169.254.169.254:80
[2022-05-10 12:56:28,371] DEBUG: Starting new HTTP connection (1): 169.254.169.254:80
DEBUG:urllib3.connectionpool:http://169.254.169.254:80 "PUT /latest/api/token HTTP/1.1" 200 56
[2022-05-10 12:56:28,373] DEBUG: http://169.254.169.254:80 "PUT /latest/api/token HTTP/1.1" 200 56
DEBUG:root:Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
[2022-05-10 12:56:28,373] DEBUG: Reading AWS credentials from /etc/medusa-secrets/medusa_s3_credentials
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/cassandra/medusa/service/grpc/server.py", line 297, in <module>
    server.serve()
  File "/home/cassandra/medusa/service/grpc/server.py", line 60, in serve
    medusa_pb2_grpc.add_MedusaServicer_to_server(MedusaService(config), self.grpc_server)
  File "/home/cassandra/medusa/service/grpc/server.py", line 99, in __init__
    self.storage = Storage(config=self.config.storage)
  File "/home/cassandra/medusa/storage/__init__.py", line 72, in __init__
    self.storage_driver = self._connect_storage()
  File "/home/cassandra/medusa/storage/__init__.py", line 92, in _connect_storage
    s3_storage = S3Storage(self._config)
  File "/home/cassandra/medusa/storage/s3_storage.py", line 40, in __init__
    super().__init__(config)
  File "/home/cassandra/medusa/storage/abstract_storage.py", line 39, in __init__
    self.driver = self.connect_storage()
  File "/home/cassandra/medusa/storage/s3_storage.py", line 78, in connect_storage
    profile = aws_config[aws_profile]
  File "/usr/lib/python3.6/configparser.py", line 959, in __getitem__
    raise KeyError(key)
KeyError: 'default'
What could be the problem?
Thanks,
Cristian
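For reference, the final frames of the traceback boil down to configparser failing to find a `[default]` section in the mounted credentials file. A minimal sketch of that logic (the path comes from the DEBUG log above; the exact Medusa code may differ):

```
import configparser

# Path reported in the log: "Reading AWS credentials from ..."
CREDENTIALS_FILE = "/etc/medusa-secrets/medusa_s3_credentials"

aws_config = configparser.ConfigParser(interpolation=None)
with open(CREDENTIALS_FILE) as f:
    aws_config.read_file(f)

# KeyError: 'default' means no [default] section was parsed -- e.g. the
# secret was mounted under a different key, or the file content is empty.
profile = aws_config["default"]
print(profile["aws_access_key_id"])
```

If this snippet also fails inside the container, the problem is the mounted file itself rather than Medusa.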

Why does my Selenium Grid in Azure Container Instances take the same time to execute tests regardless of number of nodes?

Because ACI doesn't support scaling, we deploy multiple container groups containing an Azure DevOps agent, a selenium grid hub and a selenium grid node. To try and speed things up I've tried to deploy the container groups with an additional node, identical to the first only being started on port 6666 instead of port 5555. I can see the two nodes register with the grid without issue but when I execute the same batch of tests with the additional node and without they take the exact same amount of time. How do I go about finding out what's going on here?
My ACI yaml:
apiVersion: 2018-10-01
location: australiaeast
properties:
  containers:
    - name: devops-agent
      properties:
        image: __AZUREDEVOPSAGENTIMAGE__
        resources:
          requests:
            cpu: 0.5
            memoryInGb: 1
        environmentVariables:
          - name: AZP_URL
            value: __AZUREDEVOPSPROJECTURL__
          - name: AZP_POOL
            value: __AGENTPOOLNAME__
          - name: AZP_TOKEN
            secureValue: __AZUREDEVOPSAGENTTOKEN__
          - name: SCREEN_WIDTH
            value: "1920"
          - name: SCREEN_HEIGHT
            value: "1080"
        volumeMounts:
          - name: downloads
            mountPath: /tmp/
    - name: selenium-hub
      properties:
        image: selenium/hub:3.141.59-xenon
        resources:
          requests:
            cpu: 1
            memoryInGb: 1
        ports:
          - port: 4444
    - name: chrome-node
      properties:
        image: selenium/node-chrome:3.141.59-xenon
        resources:
          requests:
            cpu: 1
            memoryInGb: 2
        environmentVariables:
          - name: HUB_HOST
            value: localhost
          - name: HUB_PORT
            value: 4444
          - name: SCREEN_WIDTH
            value: "1920"
          - name: SCREEN_HEIGHT
            value: "1080"
        volumeMounts:
          - name: devshm
            mountPath: /dev/shm
          - name: downloads
            mountPath: /home/seluser/downloads
    - name: chrome-node-2
      properties:
        image: selenium/node-chrome:3.141.59-xenon
        resources:
          requests:
            cpu: 1
            memoryInGb: 2
        environmentVariables:
          - name: HUB_HOST
            value: localhost
          - name: HUB_PORT
            value: 4444
          - name: SCREEN_WIDTH
            value: "1920"
          - name: SCREEN_HEIGHT
            value: "1080"
          - name: SE_OPTS
            value: "-port 6666"
        volumeMounts:
          - name: devshm
            mountPath: /dev/shm
          - name: downloads
            mountPath: /home/seluser/downloads
  osType: Linux
  diagnostics:
    logAnalytics:
      workspaceId: __LOGANALYTICSWORKSPACEID__
      workspaceKey: __LOGANALYTICSPRIMARYKEY__
  volumes:
    - name: devshm
      emptyDir: {}
    - name: downloads
      emptyDir: {}
  ipAddress:
    type: Public
    ports:
      - protocol: tcp
        port: '4444'
  # ==================== remove this section if not pulling images from private image registries ===============
  imageRegistryCredentials:
    - server: __IMAGEREGISTRYLOGINSERVER__
      username: __IMAGEREGISTRYUSERNAME__
      password: __IMAGEREGISTRYPASSWORD__
  # =============================================================================================================
tags: null
type: Microsoft.ContainerInstance/containerGroups
When I run my tests locally against a docker selenium grid either from Visual Studio or via dotnet vstest, my tests run in parallel across all available nodes and complete in half the time.
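One thing worth ruling out (an assumption, not a confirmed diagnosis): a grid only speeds things up if there are free slots and the test runner actually opens sessions concurrently. The 3.x node images expose slot counts via environment variables, e.g.:

```
# Hypothetical addition to each chrome node's environmentVariables
# (names per the docker-selenium 3.x README; verify for your image tag):
- name: NODE_MAX_INSTANCES
  value: "2"
- name: NODE_MAX_SESSION
  value: "2"
```

Even with two registered nodes, a test run that requests one session at a time (e.g. no parallel execution configured in the test framework on the agent) will take the same wall-clock time regardless of node count.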

How to inject environment variables into the driver pod when using spark-on-k8s?

I am writing a Kubernetes Spark Application using GCP spark on k8s.
Currently, I am stuck at not being able to inject environment variables into my container.
I am following the doc here
Manifest:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: spark-search-indexer
namespace: spark-operator
spec:
type: Scala
mode: cluster
image: "gcr.io/spark-operator/spark:v2.4.5"
imagePullPolicy: Always
mainClass: com.quid.indexer.news.jobs.ESIndexingJob
mainApplicationFile: "https://lala.com/baba-0.0.43.jar"
arguments:
- "--esSink"
- "http://something:9200/mo-sn-{yyyy-MM}-v0.0.43/searchable-article"
- "-streaming"
- "--kafkaTopics"
- "annotated_blogs,annotated_ln_news,annotated_news"
- "--kafkaBrokers"
- "10.1.1.1:9092"
sparkVersion: "2.4.5"
restartPolicy:
type: Never
volumes:
- name: "test-volume"
hostPath:
path: "/tmp"
type: Directory
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
env:
- name: "DEMOGRAPHICS_ES_URI"
value: "somevalue"
labels:
version: 2.4.5
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
executor:
cores: 1
instances: 1
memory: "512m"
env:
- name: "DEMOGRAPHICS_ES_URI"
value: "somevalue"
labels:
version: 2.4.5
volumeMounts:
- name: "test-volume"
mountPath: "/tmp"
Environment variables set on the pod:
Environment:
  SPARK_DRIVER_BIND_ADDRESS:  (v1:status.podIP)
  SPARK_LOCAL_DIRS:           /var/data/spark-1ed8539d-b157-4fab-9aa6-daff5789bfb5
  SPARK_CONF_DIR:             /opt/spark/conf
It turns out that to use this, one must enable webhooks (how to set this up is in the quick-start guide here).
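(For a Helm-based install, enabling the webhook is typically a chart flag along the lines of `--set webhook.enable=true` on the spark-operator chart; treat the exact flag name as an assumption to verify against the chart version you use.)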
The other approach could be to use `envVars`. Example:
spec:
  executor:
    envVars:
      DEMOGRAPHICS_ES_URI: "somevalue"
Ref: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/issues/978

Python3 logging yaml configuration

I am new to Python. I am trying to import a logging configuration defined in YAML.
I get this error:
Traceback (most recent call last):
  File "D:/python_3/db_interact/dbInteract.py", line 200, in <module>
    logging.config.fileConfig('conf/logging.yaml')
  File "C:\Programs\Python\Python36\lib\logging\config.py", line 74, in fileConfig
    cp.read(fname)
  File "C:\Programs\Python\Python36\lib\configparser.py", line 697, in read
    self._read(fp, filename)
  File "C:\Programs\Python\Python36\lib\configparser.py", line 1080, in _read
    raise MissingSectionHeaderError(fpname, lineno, line)
configparser.MissingSectionHeaderError: File contains no section headers.
file: 'conf/logging.yaml', line: 1
'version: 1\n'
I import configuration using:
logging.config.fileConfig('conf/logging.yaml')
My configuration is:
version: 1
disable_existing_loggers: true
formatters:
  simple:
    format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
    stream: ext://sys.stdout
  file:
    class: logging.FileHandler
    level: DEBUG
    filename: logs/dbInteract.log
loggers:
  simpleExample:
    level: DEBUG
    handlers: [console]
    propagate: no
root:
  level: DEBUG
  handlers: [console, file]
I use Python 3.6.4.
Thanks
According to the documentation, `fileConfig` reads the logging configuration from a configparser-format file, but what you are supplying is a YAML-format file.
So you should parse your YAML file into a dict and then pass it to `logging.config.dictConfig(config)`:
import logging.config
import yaml

with open('./test.yml', 'r') as stream:
    config = yaml.load(stream, Loader=yaml.FullLoader)

logging.config.dictConfig(config)
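Once `dictConfig` has run, the loggers declared in the YAML are available as usual, for example:

```
import logging

logger = logging.getLogger('simpleExample')  # name defined under 'loggers' in the YAML
logger.debug('this goes to the console handler only')
```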
