unable to initialize snowflake data source - python-3.x

I am trying to access the snowflake datasource using "great_expectations" library.
The following is what I tried so far:
from ruamel import yaml
import great_expectations as ge
from great_expectations.core.batch import BatchRequest, RuntimeBatchRequest
context = ge.get_context()
datasource_config = {
"name": "my_snowflake_datasource",
"class_name": "Datasource",
"execution_engine": {
"class_name": "SqlAlchemyExecutionEngine",
"connection_string": "snowflake://myusername:mypass#myaccount/myDB/myschema?warehouse=mywh&role=myadmin",
},
"data_connectors": {
"default_runtime_data_connector_name": {
"class_name": "RuntimeDataConnector",
"batch_identifiers": ["default_identifier_name"],
},
"default_inferred_data_connector_name": {
"class_name": "InferredAssetSqlDataConnector",
"include_schema_name": True,
},
},
}
print(context.test_yaml_config(yaml.dump(datasource_config)))
I initiated great_expectation before executing above code:
great_expectations init
but I am getting the error below:
great_expectations.exceptions.exceptions.DatasourceInitializationError: Cannot initialize datasource my_snowflake_datasource, error: 'NoneType' object has no attribute 'create_engine'
What am I doing wrong?

Your configuration seems to be ok, corresponding to the example here.
If you look at the traceback you should notice that the error propagates starting at the file great_expectations/execution_engine/sqlalchemy_execution_engine.py in your virtual environment.
The actual line where the error occurs is:
self.engine = sa.create_engine(connection_string, **kwargs)
And if you search for that sa at the top of that file:
import sqlalchemy as sa
make_url = import_make_url()
except ImportError:
sa = None
So sqlalchemy is not installed, which you
don't get automatically in your environement if you install greate_expectiations. The thing to do is to
install snowflake-sqlalchemy, since you want to use sqlalchemy's snowflake
plugin (assumption based on your connection_string).
/your/virtualenv/bin/python -m pip install snowflake-sqlalchemy
After that you should no longer get an error, it looks like test_yaml_config is waiting for the connection
to time out.
What worries me greatly is the documented use of a deprecated API of ruamel.yaml.
The function ruamel.yaml.dump is going to be removed in the near future, and you
should use the .dump() method of a ruamel.yaml.YAML() instance.
You should use the following code instead:
import sys
from ruamel.yaml import YAML
import great_expectations as ge
context = ge.get_context()
datasource_config = {
"name": "my_snowflake_datasource",
"class_name": "Datasource",
"execution_engine": {
"class_name": "SqlAlchemyExecutionEngine",
"connection_string": "snowflake://myusername:mypass#myaccount/myDB/myschema?warehouse=mywh&role=myadmin",
},
"data_connectors": {
"default_runtime_data_connector_name": {
"class_name": "RuntimeDataConnector",
"batch_identifiers": ["default_identifier_name"],
},
"default_inferred_data_connector_name": {
"class_name": "InferredAssetSqlDataConnector",
"include_schema_name": True,
},
},
}
yaml = YAML()
yaml.dump(datasource_config, sys.stdout, transform=context.test_yaml_config)
I'll make a PR for great-excpectations to update their documentation/use of ruamel.yaml.

Related

Local hosting python azure function fail with M1

As title, I want to host azure function in local with VSCode but something error.
Python version 3.9.12 (python3).
Azure Functions Core Tools
Core Tools Version: 4.0.4483 Commit hash: N/A (64-bit)
Function Runtime Version: 4.1.3.17473
host.json:
{
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[2.*, 3.0.0)"
}
}
local.setting.json:
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": ""
}
}
Error Message:
Functions:
HttpTrigger1: [GET,POST] http://localhost:7071/api/HttpTrigger1
For detailed output, run func with --verbose flag.
....
[2022-05-09T06:52:10.300Z] from . import dispatcher
[2022-05-09T06:52:10.300Z] File "/opt/homebrew/Cellar/azure-functions-core-tools#4/4.0.4483/workers/python/3.9/OSX/X64/azure_functions_worker/dispatcher.py", line 19, in <module>
[2022-05-09T06:52:10.300Z] import grpc
[2022-05-09T06:52:10.300Z] File "/opt/homebrew/Cellar/azure-functions-core-tools#4/4.0.4483/workers/python/3.9/OSX/X64/grpc/__init__.py", line 23, in <module>
[2022-05-09T06:52:10.300Z] from grpc._cython import cygrpc as _cygrpc
[2022-05-09T06:52:10.300Z] ImportError: dlopen(/opt/homebrew/Cellar/azure-functions-core-tools#4/4.0.4483/workers/python/3.9/OSX/X64/grpc/_cython/cygrpc.cpython-39-darwin.so, 0x0002): tried: '/opt/homebrew/Cellar/azure-functions-core-tools#4/4.0.4483/workers/python/3.9/OSX/X64/grpc/_cython/cygrpc.cpython-39-darwin.so' (mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e')), '/usr/local/lib/cygrpc.cpython-39-darwin.so' (no such file), '/usr/lib/cygrpc.cpython-39-darwin.so' (no such file)
[2022-05-09T06:52:13.512Z] Host lock lease acquired by instance ID '0000000000000000000000008F1C7F2E'.
After reproducing from our end we observed that If you have an arm64 Python, it'll never be able to load an x86_64 shared library hence we need to enable Rosetta which works at a process by process level.
Steps to be followed
Check the Rosetta in iTerm.
Install homebrew, azure functions core tools, and python in the current homebrew.
And then run your azure function.
REFERENCES:
Support running on M1 Macs [Python]

Adding python logging to FastApi endpoints, hosted on docker doesn't display API Endpoints logs

I have a fastapi app on which I want to add python logging. I followed the basic tutorial and added this, however this doesn't add API but just gunicorn logging.
So I have a local server hosted using docker build so running server using docker-compose up and testing my endpoints using api client (Insomnia, similar to postman).
Below is the code where no log file is created and hence no log statements added.
My project str is as follows:
project/
src/
api/
models/
users.py
routers/
users.py
main.py
logging.conf
"""
main.py Main is the starting point for the app.
"""
import logging
import logging.config
from fastapi import FastAPI
from msgpack_asgi import MessagePackMiddleware
import uvicorn
from api.routers import users
logger = logging.getLogger(__name__)
app = FastAPI(debug=True)
app.include_router(users.router)
#app.get("/check")
async def check():
"""Simple health check endpoint."""
logger.info("logging from the root logger")
return {"success": True}
Also, I am using gunicorn.conf that looks like this:
[program:gunicorn]
command=poetry run gunicorn -c /etc/gunicorn/gunicorn.conf.py foodgame_api.main:app
directory=/var/www/
autostart=true
autorestart=true
redirect_stderr=true
And gunicorn.conf.py as
import multiprocessing
bind = "unix:/tmp/gunicorn.sock"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornWorker"
loglevel = "debug"
errorlog = "-"
capture_output = True
chdir = "/var/www"
reload = True
reload_engine = "auto"
accesslog = "-"
access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"'
This is my output terminal for the above API endpoint on docker:
Could anyone please guide me here? I am new to FastApi so some help will be appreciated.
Inspired by #JPG's answer, but using a pydantic model looked cleaner.
You might want to expose more variables. This config worked good for me.
from pydantic import BaseModel
class LogConfig(BaseModel):
"""Logging configuration to be set for the server"""
LOGGER_NAME: str = "mycoolapp"
LOG_FORMAT: str = "%(levelprefix)s | %(asctime)s | %(message)s"
LOG_LEVEL: str = "DEBUG"
# Logging config
version = 1
disable_existing_loggers = False
formatters = {
"default": {
"()": "uvicorn.logging.DefaultFormatter",
"fmt": LOG_FORMAT,
"datefmt": "%Y-%m-%d %H:%M:%S",
},
}
handlers = {
"default": {
"formatter": "default",
"class": "logging.StreamHandler",
"stream": "ext://sys.stderr",
},
}
loggers = {
LOGGER_NAME: {"handlers": ["default"], "level": LOG_LEVEL},
}
Then import it into your main.py file as:
from logging.config import dictConfig
import logging
from .config import LogConfig
dictConfig(LogConfig().dict())
logger = logging.getLogger("mycoolapp")
logger.info("Dummy Info")
logger.error("Dummy Error")
logger.debug("Dummy Debug")
logger.warning("Dummy Warning")
Which gives:
I would use dict log config
create a logger config as below,
# my_log_conf.py
log_config = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"default": {
"()": "uvicorn.logging.DefaultFormatter",
"fmt": "%(levelprefix)s %(asctime)s %(message)s",
"datefmt": "%Y-%m-%d %H:%M:%S",
},
},
"handlers": {
"default": {
"formatter": "default",
"class": "logging.StreamHandler",
"stream": "ext://sys.stderr",
},
},
"loggers": {
"foo-logger": {"handlers": ["default"], "level": "DEBUG"},
},
}
Then, load the config using dictConfig function as,
from logging.config import dictConfig
from fastapi import FastAPI
from some.where.my_log_conf import log_config
dictConfig(log_config)
app = FastAPI(debug=True)
Note: It is recommended to call the dictConfig(...) function before the FastAPI initialization.
After the initialization, you can use logger named foo-logger anywhere in your code as,
import logging
logger = logging.getLogger('foo-logger')
logger.debug('This is test')

Unable to build local AMLS environment with private wheel

I am trying to write a small program using the AzureML Python SDK (v1.0.85) to register an Environment in AMLS and use that definition to construct a local Conda environment when experiments are being run (for a pre-trained model). The code works fine for simple scenarios where all dependencies are loaded from Conda/ public PyPI, but when I introduce a private dependency (e.g. a utils library) I am getting a InternalServerError with the message "Error getting recipe specifications".
The code I am using to register the environment is (after having authenticated to Azure and connected to our workspace):
environment_name = config['environment']['name']
py_version = "3.7"
conda_packages = ["pip"]
pip_packages = ["azureml-defaults"]
private_packages = ["./env-wheels/utils-0.0.3-py3-none-any.whl"]
print(f"Creating environment with name {environment_name}")
environment = Environment(name=environment_name)
conda_deps = CondaDependencies()
print(f"Adding Python version: {py_version}")
conda_deps.set_python_version(py_version)
for conda_pkg in conda_packages:
print(f"Adding Conda denpendency: {conda_pkg}")
conda_deps.add_conda_package(conda_pkg)
for pip_pkg in pip_packages:
print(f"Adding Pip dependency: {pip_pkg}")
conda_deps.add_pip_package(pip_pkg)
for private_pkg in private_packages:
print(f"Uploading private wheel from {private_pkg}")
private_pkg_url = Environment.add_private_pip_wheel(workspace=ws, file_path=Path(private_pkg).absolute(), exist_ok=True)
print(f"Adding private Pip dependency: {private_pkg_url}")
conda_deps.add_pip_package(private_pkg_url)
environment.python.conda_dependencies = conda_deps
environment.register(workspace=ws)
And the code I am using to create the local Conda environment is:
amls_environment = Environment.get(ws, name=environment_name, version=environment_version)
print(f"Building environment...")
amls_environment.build_local(workspace=ws)
The exact error message being returned when build_local(...) is called is:
Traceback (most recent call last):
File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\core\environment.py", line 814, in build_local
raise error
File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\core\environment.py", line 807, in build_local
recipe = environment_client._get_recipe_for_build(name=self.name, version=self.version, **payload)
File "C:\Anaconda\envs\AMLSExperiment\lib\site-packages\azureml\_restclient\environment_client.py", line 171, in _get_recipe_for_build
raise Exception(message)
Exception: Error getting recipe specifications. Code: 500
: {
"error": {
"code": "ServiceError",
"message": "InternalServerError",
"detailsUri": null,
"target": null,
"details": [],
"innerError": null,
"debugInfo": null
},
"correlation": {
"operation": "15043e1469e85a4c96a3c18c45a2af67",
"request": "19231be75a2b8192"
},
"environment": "westeurope",
"location": "westeurope",
"time": "2020-02-28T09:38:47.8900715+00:00"
}
Process finished with exit code 1
Has anyone seen this error before or able to provide some guidance around what the issue may be?
The issue was with out firewall blocking the required requests between AMLS and the storage container (I presume to get the environment definitions/ private wheels).
We resolved this by updating the firewall with appropriate ALLOW rules for the AMLS service to contact and read from the attached storage container.
Assuming that you'd like to run in the script on a remote compute, then my suggestion would be to pass the environment you just "got". to a RunConfiguration, then pass that to an ScriptRunConfig, Estimator, or a PythonScriptStep
from azureml.core import ScriptRunConfig
from azureml.core.runconfig import DEFAULT_CPU_IMAGE
src = ScriptRunConfig(source_directory=project_folder, script='train.py')
# Set compute target to the one created in previous step
src.run_config.target = cpu_cluster.name
# Set environment
amls_environment = Environment.get(ws, name=environment_name, version=environment_version)
src.run_config.environment = amls_environment
run = experiment.submit(config=src)
run
Check out the rest of the notebook here.
If you're looking for a local run this notebook might help.

Expected 'multiple' syntax before 'single' syntax

I am trying to import files like this but getting error: Expected 'multiple' syntax before 'single' syntax
import { Component, Prop, Vue } from 'vue-property-decorator';
import { getModule } from 'vuex-module-decorators';
import { ApiException, IProduct, IProductCategory } from '#/services'; // error here
import { INavs } from '#/types';
Rule config:
'sort-imports': ['error', {
'ignoreCase': false,
'ignoreDeclarationSort': false,
'ignoreMemberSort': false,
'memberSyntaxSortOrder': ['none', 'all', 'multiple', 'single']
}]
import { ApiException, IProduct, IProductCategory } from '#/services'; is importing multiple (three) exports.
Both import { getModule } from 'vuex-module-decorators'; and import { INavs } from '#/types'; are only importing a single named export.
That error will go way if you move import { ApiException, IProduct, IProductCategory } up one line so it's above the single imports.
This is configured in your settings where it says 'memberSyntaxSortOrder': ['none', 'all', 'multiple', 'single'] because 'multiple' is listed before 'single'.
You can read more about it in the eslint documentation here https://eslint.org/docs/rules/sort-imports
Ran into this too looking for a CI solution for reordering multiple imports, since ESLint --fix only supports multiple members on a single line; i.e. autofixing unsupported when spread over multiple lines - e.g. from a low Prettier printWidth of ~80.
eslint-plugin-simple-import-sort works great, fixed the errors after amending config from the docs:
Make sure not to use other sorting rules at the same time:
[sort-imports]
[import/order]
To resolve this issue I'll recommend modifying your .eslintrc.json setting
"sort-imports": ["error", {
"ignoreCase": false,
"ignoreDeclarationSort": false,
"ignoreMemberSort": false,
"memberSyntaxSortOrder": ["none", "all", "multiple", "single"],
"allowSeparatedGroups": false // <-- Change this to true
}],
Which prevents you from reordering local imports and third party imports that are in different groups but violate the memberSyntaxSortOrder as earlier suggested

Create Stack Instances Parameter Issue

I'm creating stack instance, using python boto3 SDK. According to the documentation I should be able to use ParameterOverrides but I'm getting following error..
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "ParameterOverrides", must be one of: StackSetName, Accounts, Regions, OperationPreferences, OperationId
Environment :
aws-cli/1.11.172 Python/2.7.14 botocore/1.7.30
imports used
import boto3
import botocore
Following is the code
try:
stackset_instance_response = stackset_client.create_stack_instances(
StackSetName=cloudtrail_stackset_name,
Accounts=[
account_id
],
Regions=[
stack_region
],
OperationPreferences={
'RegionOrder': [
stack_region
],
'FailureToleranceCount': 0,
'MaxConcurrentCount': 1
},
ParameterOverrides=[
{
'ParameterKey': 'CloudtrailBucket',
'ParameterValue': 'test-bucket'
},
{
'ParameterKey': 'Environment',
'ParameterValue': 'SANDBOX'
},
{
'ParameterKey': 'IsCloudTrailEnabled',
'ParameterValue': 'NO'
}
]
)
print("Stackset create Response : " + str(stackset_instance_response))
operation_id = stackset_instance_response['OperationId']
print (operation_id)
except botocore.exceptions.ClientError as e:
print("Stackset creation error : " + str(e))
I'm not sure where I'm doing wrong, any help would be greatly appreciated.
Thank you.
1.8.0 is the first version of Botocore that has parameteroverrides defined.
https://github.com/boto/botocore/blob/1.8.0/botocore/data/cloudformation/2010-05-15/service-2.json#L1087-L1090
1.7.30 doesn't have that defined. https://github.com/boto/botocore/blob/1.7.30/botocore/data/cloudformation/2010-05-15/service-2.json

Resources