How to use two databases: Postgres and Snowflake with alembic and sqlalchemy? - python-3.x

I want to use two databases, Postgres and Snowflake, with the Alembic migration tool in a single FastAPI app.
I am able to run Alembic revisions and alembic upgrade when using a single database (Postgres), but Alembic runs into problems when using multiple databases.
I tried using the Snowflake database independently in a separate app with Snowflake as the only database. There, revisions and alembic upgrade work perfectly fine, but not in a single app with two databases.
Here is my directory structure, as suggested in a few articles:
main-project/
    postgres-migration/
        versions/
        env.py
        README
        script.py.mako
    snowflake-migration/
        versions/
        env.py
        README
        script.py.mako
Here is my alembic.ini file, generated for Postgres and then hand-edited to support Snowflake:
# A generic, single database configuration.
[alembic]
# path to migration scripts
script_location = db-migration
[A_SNOWFLAKE_SCHEMA]
# path to env.py and migration scripts for schema1
script_location = snowflake-db-migration
# template used to generate migration files
# file_template = %%(rev)s_%%(slug)s
# sys.path path, will be prepended to sys.path if present.
# defaults to the current working directory.
prepend_sys_path = .
# timezone to use when rendering the date
# within the migration file as well as the filename.
# string value is passed to dateutil.tz.gettz()
# leave blank for localtime
# timezone =
# max length of characters to apply to the
# "slug" field
# truncate_slug_length = 40
# set to 'true' to run the environment during
# the 'revision' command, regardless of autogenerate
# revision_environment = false
# set to 'true' to allow .pyc and .pyo files without
# a source .py file to be detected as revisions in the
# versions/ directory
# sourceless = false
# version location specification; this defaults
# to db-migration/versions. When using multiple version
# directories, initial revisions must be specified with --version-path
# version_locations = %(here)s/bar %(here)s/bat db-migration/versions
# the output encoding used when revision files
# are written from script.py.mako
# output_encoding = utf-8
# sqlalchemy.url = driver://user:pass@localhost/dbname
[post_write_hooks]
# post_write_hooks defines scripts or Python functions that are run
# on newly generated revision scripts. See the documentation for further
# detail and examples
# format using "black" - use the console_scripts runner, against the "black" entrypoint
# hooks = black
# black.type = console_scripts
# black.entrypoint = black
# black.options = -l 79 REVISION_SCRIPT_FILENAME
# Logging configuration
[loggers]
keys = root,sqlalchemy,alembic
[handlers]
keys = console
[formatters]
keys = generic
[logger_root]
level = WARN
handlers = console
qualname =
[logger_sqlalchemy]
level = WARN
handlers =
qualname = sqlalchemy.engine
[logger_alembic]
level = INFO
handlers =
qualname = alembic
[handler_console]
class = StreamHandler
args = (sys.stderr,)
level = NOTSET
formatter = generic
[formatter_generic]
format = %(levelname)-5.5s [%(name)s] %(message)s
datefmt = %H:%M:%S
The file above is able to create revision scripts inside the versions/ folder of snowflake-db-migration, but running the generated revision with the following command throws errors:
alembic upgrade head
How do I convert this .ini file into one that supports multiple databases?
I tried the command:
alembic init --template multidb snowflake-db-migration
but I am not sure how to integrate the changes it generates (see the sketch below).
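For reference, the multidb template generates an alembic.ini roughly like the following (a sketch based on the template's defaults; engine1 and engine2 are the template's placeholder section names, which you would rename for your Postgres and Snowflake databases):
[alembic]
script_location = snowflake-db-migration
databases = engine1, engine2

[engine1]
sqlalchemy.url = driver://user:pass@localhost/dbname

[engine2]
sqlalchemy.url = driver://user:pass@localhost/dbname2
The env.py that comes with the template then loops over the names listed in databases and runs the migrations for each engine in turn.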
Also, I don't want to hardcode the sqlalchemy.url variable in the .ini file, because I need to read the Snowflake settings from a .toml file that contains all environment variables.
There are a few existing answers, but none of them are accepted or exactly match my use case.
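A minimal sketch of one way to avoid hardcoding sqlalchemy.url: leave it unset in alembic.ini and set it programmatically in each env.py. The settings.toml path and the key names below are assumptions for illustration; adjust them to your own file:
import tomllib  # Python 3.11+ stdlib; on older versions use the third-party "toml" package

from alembic import context

config = context.config

# Hypothetical .toml layout -- replace the path and keys with your own.
with open("settings.toml", "rb") as f:
    settings = tomllib.load(f)

# Supply the URL before env.py builds the engine from the config.
config.set_main_option("sqlalchemy.url", settings["snowflake"]["url"])
config.set_main_option is a real Config method, so the same pattern works in both the Postgres env.py and the Snowflake env.py.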

Related

SonarQube (SAST scan) log injection hotspot issue

I have written code to add logs using the logging module in Python. When I ran the code through SonarQube, it flagged the following hotspot:
Make sure that this logger's configuration is safe.
Python code:
from logging.config import fileConfig
import logging

from alembic import context

# This is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config

# Interpret the config file for Python logging.
# This line sets up loggers basically.
fileConfig(config.config_file_name)
logger = logging.getLogger("alembic.env")


class DefaultConfig:
    DEVELOPMENT = False
    DEBUG = False
    TESTING = False
    LOGGING_LEVEL = "DEBUG"
    CSRF_ENABLED = True
Please help me resolve this hotspot. One more question: is it mandatory to look into low-priority hotspots?
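This rule is a security hotspot, so SonarQube is asking you to review whether the logger configuration can be influenced by untrusted input; if it cannot, the hotspot can be marked as reviewed/safe. As a sketch of how to make that review easy (not the official SonarQube remediation, and load_logging_config is a hypothetical helper), only accept a config file from a trusted directory:
import os
from logging.config import fileConfig

def load_logging_config(config_file_name: str) -> None:
    # Resolve the path and refuse anything outside the project
    # directory, so an attacker-controlled path cannot be loaded.
    trusted_root = os.path.dirname(os.path.abspath(__file__))
    resolved = os.path.abspath(config_file_name)
    if not resolved.startswith(trusted_root + os.sep):
        raise ValueError('untrusted logging config: ' + resolved)
    fileConfig(resolved, disable_existing_loggers=False)
On the second question: low-priority hotspots still typically need a review decision (safe or fix), but reviewing them is not the same as being forced to change the code.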

Tomcat is generating logs in multiple places, one in the default path "/logs" and another in the custom directory that is specified externally

We are planning to rotate the logs generated by Tomcat using logrotate for volume maintenance. When I checked for the logs, I found two places in which they were being generated: "../apache-tomcat-7.0.57/logs" and the path specified in "logging.properties". From the Tomcat documentation I understood that Tomcat uses the default path "/logs" if no path is specified externally in "logging.properties". I was not able to find out whether I have missed any configuration.
logging.properties file:
handlers = 1catalina.org.apache.juli.FileHandler, 2localhost.org.apache.juli.FileHandler, 3manager.org.apache.juli.FileHandler, 4host-manager.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = <custom path>
1catalina.org.apache.juli.FileHandler.prefix = catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = <custom path>
2localhost.org.apache.juli.FileHandler.prefix = localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = <custom path>
3manager.org.apache.juli.FileHandler.prefix = manager.
4host-manager.org.apache.juli.FileHandler.level = FINE
4host-manager.org.apache.juli.FileHandler.directory = <custom path>
4host-manager.org.apache.juli.FileHandler.prefix = host-manager.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
############################################################
# Facility specific properties.
# Provides extra control for each logger.
############################################################
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = 2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = 3manager.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/host-manager].handlers = 4host-manager.org.apache.juli.FileHandler
# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE
# To see debug messages in TldLocationsCache, uncomment the following line:
#org.apache.jasper.compiler.TldLocationsCache.level = FINE
My question is: why are the logs being generated in multiple places, and how do I make Tomcat log to just one directory for maintenance?
Reference link:
https://tomcat.apache.org/tomcat-7.0-doc/logging.html
By default, Tomcat logs to ${catalina.base}/logs, which is what you should see configured in ${catalina.base}/conf/logging.properties.
Additionally, standard output (e.g. exception.printStackTrace()) goes by default into ${catalina.base}/logs/catalina.out.
catalina.out can be redirected to a different file by setting the environment variable CATALINA_OUT or CATALINA_OUT_CMD. To see what CATALINA_OUT_CMD does, it is easiest to read the comments in ${catalina.home}/bin/catalina.sh.
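For example (a sketch; the target path is illustrative), you can point catalina.out at the same directory as your JULI FileHandlers before starting Tomcat, so everything ends up in one place for logrotate:
export CATALINA_OUT=/var/log/tomcat/catalina.out   # picked up by bin/catalina.sh
$CATALINA_HOME/bin/startup.sh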

How to get Sphinx to recognize pandas as a module

I am using Python 3.9.2 with sphinx-build 3.5.2. I have created a project with the following directory and file structure:
core_utilities
|_core_utilities
| |_ __init__.py
| |_read_files.py
|_docs
|_ sphinx
|_Makefile
|_conf.pg
|_source
| |_conf.py
| |_index.rst
| |_read_files.rst
|_build
The read_files.py file contains the following code. NOTE: I simplified this example so it would not have distracting information. Within this code there is one class that contains one member function to read a text file, look for a keyword, and read the variable to the right of the keyword as a numpy.float64 variable. The stand-alone function is written to read in a csv file with specific data types and save it to a pandas dataframe.
# Import packages here
import os
import sys
import numpy as np
import pandas as pd
from typing import List


class ReadTextFileKeywords:

    def __init__(self, file_name: str):
        self.file_name = file_name
        if not os.path.isfile(file_name):
            sys.exit('{}{}{}'.format('FATAL ERROR: ', file_name, ' does not exist'))
# ----------------------------------------------------------------------------

    def read_double(self, key_words: str) -> np.float64:
        # read_sentence is omitted from this simplified example
        values = self.read_sentence(key_words)
        values = values.split()
        return np.float64(values[0])
# ================================================================================
# ================================================================================


def read_csv_columns_by_headers(file_name: str, headers: List[str],
                                data_type: List[type],
                                skip: int = 0) -> pd.DataFrame:
    if not os.path.isfile(file_name):
        sys.exit('{}{}{}'.format('FATAL ERROR: ', file_name, ' does not exist'))
    dat = dict(zip(headers, data_type))
    df = pd.read_csv(file_name, usecols=headers, dtype=dat, skiprows=skip)
    return df
The conf.py file has the following information in it.
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../../../core_utilities'))
# -- Project information -----------------------------------------------------
project = 'Core Utilities'
copyright = 'my copyright'
author = 'my name'
# The full version, including alpha/beta/rc tags
release = '0.1.0'
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.todo', 'sphinx.ext.viewcode', 'sphinx.ext.autodoc',
              'sphinx.ext.autosummary', 'sphinx.ext.githubpages']
autodoc_member_order = 'groupwise'
autodoc_default_flags = ['members', 'show-inheritance']
autosummary_generate = True
autodock_mock_imports = ['numpy']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'nature'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
The index.rst file has the following information.
.. Core Utilities documentation master file, created by
   sphinx-quickstart
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Core Utilities's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   read_files

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
and the read_files.rst file has the following information:
**********
read_files
**********

The ``read_files`` module contains methods and classes that allow a user to read different
types of files in different ways. Within the module a user will find functionality that
allows them to read text files and csv files by columns and keywords. This module
also contains functions to read sqlite databases, xml files, json files, yaml files,
and html files

.. autoclass:: test.ReadTextFileKeywords
   :members:

.. autofunction:: test.read_csv_columns_by_headers
In addition, I am using a virtual environment that is active when I try to compile this from the command line. I run the following command to build the html files with sphinx:
sphinx-build -b html source build
When I run the above command it fails with the following error:
WARNING: autodoc: failed to import class 'ReadTextFileKeywords' from module 'read_files'; the following exception was raised: No module named pandas
If I delete the line import pandas as pd and then delete the function read_csv_columns_by_headers along with the call to the function in the read_files.rst file, everything compiles fine. Sphinx is able to find numpy, but for some reason it does not recognize pandas, although both exist in the virtual environment and were installed with pip3. Does anyone know why sphinx is able to find other modules, but not pandas?
Add pandas to autodoc_mock_imports in conf.py (note that the option is spelled autodoc_mock_imports, not autodock_mock_imports as in the question's conf.py):
autodoc_mock_imports = ['numpy','pandas']

Sphinx cannot find pandas

I am trying to document a Python class using sphinx. I am using Python 3.9.2 and sphinx-build 3.5.2. I am running the command sphinx-build -b html source build to transform the .rst files into .html files. I have already successfully executed this configuration on other classes and functions in the code-suite, so I know there is nothing wrong with the way I have organized my files. When I run the command I get the following error:
Warning: autodoc: failed to import class 'ReadTextFileKeywords' from module 'read_files'; the following exception was raised: No module named pandas
I do have the most recent version of pandas loaded globally and in my virtual environment. Does anyone know why it cannot load pandas and how I can fix this issue?
For reference, the file structure looks like this;
core_utilities
|_core_utilities
|_docs
|_sphinx
|_Makefile
|_build
| |_html files
|_source
|_index.rst
|_Introduction.rst
|_read_files.rst
|_operating_system.rst
|_conf.py
The conf.py file has the following information
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../../../core_utilities'))
# -- Project information -----------------------------------------------------
project = 'Core Utilities'
copyright = '2021, Jonathan A. Webb'
author = 'Jonathan A. Webb'
# The full version, including alpha/beta/rc tags
release = '0.1.0'
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.todo', 'sphinx.ext.viewcode', 'sphinx.ext.autodoc',
              'sphinx.ext.autosummary', 'sphinx.ext.githubpages']
autodoc_member_order = 'groupwise'
autodoc_default_flags = ['members', 'show-inheritance']
autosummary_generate = True
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'nature'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
and the index file has the following format:
.. Core Utilities documentation master file, created by
   sphinx-quickstart
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Core Utilities's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   Introduction
   operating_system
   read_files

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
Sphinx works just fine when I import docstrings from the operating_system module, so I know that is not the problem. It fails when I try to import from the read_files module. The read_files.rst file has the following format.
**********
read_files
**********
The ``read_files`` module contains methods and classes that allow a user to read different
types of files in different ways. Within the module a user will find functionality that
allows them to read text files and csv files by columns and keywords. This module
also contains functions to read sqlite databases, xml files, json files, yaml files,
and html files
.. autoclass:: read_files.ReadTextFileKeywords
   :members:
And the class it is reading, including the file header, has the following format:
# Import packages here
import os
import sys
import numpy as np
import pandas as pd
from typing import List
import sqlite3
# ================================================================================
# ================================================================================
# Date: Month Day, Year
# Purpose: Describe the purpose of functions of this file
# Source Code Metadata
__author__ = "Jonathan A. Webb"
__copyright__ = "Copyright 2021, Jon Webb Inc."
__version__ = "1.0"
# ================================================================================
# ================================================================================
# Insert Code here


class ReadTextFileKeywords:
    """
    A class to find keywords in a text file and the variable(s)
    to the right of the keyword. This class must inherit the
    ``FileUtilities`` class

    :param file_name: The name of the file being read to include the
                      path-link

    For the purposes of demonstrating the use of this class, assume
    a text file titled ``test_file.txt`` with the following contents.

    .. code-block:: text

        sentence: This is a short sentence!
        float: 3.1415 # this is a float comment
        double: 3.141596235941 # this is a double comment
        String: test # this is a string comment
        Integer Value: 3 # This is an integer comment
        float list: 1.2 3.4 4.5 5.6 6.7
        double list: 1.12321 344.3454453 21.434553
        integer list: 1 2 3 4 5 6 7
    """

    def __init__(self, file_name: str):
        self.file_name = file_name
        if not os.path.isfile(file_name):
            sys.exit('{}{}{}'.format('FATAL ERROR: ', file_name, ' does not exist'))
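As with the previous question, the likely culprit is that autodoc tries to import read_files in an interpreter where pandas is not importable. A minimal sketch of the fix, assuming that is the cause, is to mock the heavy third-party imports in conf.py:
# conf.py -- autodoc only needs to introspect read_files, not run it,
# so the heavy dependencies can be mocked instead of installed.
autodoc_mock_imports = ['numpy', 'pandas']
Alternatively, run sphinx-build with the same interpreter (and virtual environment) that has pandas installed.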

How to add a module folder /tar.gz to nodes in Pyspark

I am running pyspark in an IPython notebook after doing the following configuration:
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --NotebookApp.open_browser=False --NotebookApp.ip='*' --NotebookApp.port=8880"
export PYSPARK_PYTHON=/usr/bin/python
I have a custom UDF which makes use of a module called mzgeohash, but I am getting a module-not-found error. I guess this module might be missing on the workers/nodes. I tried sc.addPyFile and so on, but what is an effective way to add a cloned folder or a tar.gz Python module in this case, from IPython?
Here is how I do it. Basically, the idea is to create a zip of all the files in your module and pass it to sc.addPyFile():
import os
import random
import string
import zipfile

def rand_str(n):
    # helper: random suffix so concurrent runs don't collide
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(n))

def ziplib():
    libpath = os.path.dirname(__file__)             # this should point to your packages directory
    zippath = '/tmp/mylib-' + rand_str(6) + '.zip'  # some random filename in a writable directory
    zf = zipfile.PyZipFile(zippath, mode='w')
    try:
        zf.debug = 3                                # make it verbose, good for debugging
        zf.writepy(libpath)
        return zippath                              # return path to the generated zip archive
    finally:
        zf.close()

...
zip_path = ziplib()      # generate zip archive containing your lib
sc.addPyFile(zip_path)   # add the entire archive to the SparkContext
...
os.remove(zip_path)      # don't forget to remove the temporary file, preferably in a "finally" clause
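Note that sc.addPyFile accepts .py and .zip dependencies, so for a tar.gz the usual route is to repackage the module as a zip, as above. A usage sketch from the notebook (zip_path being the output of the ziplib helper above):
zip_path = ziplib()
sc.addPyFile(zip_path)   # ship the packaged module to the executors
import mzgeohash         # now importable inside the UDF on the workers as well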
