How to load Elasticsearch in Python using SQLAlchemy? - python-3.x

I am trying to connect to Elasticsearch from a Jupyter notebook using the line below:
engine = create_engine("elasticsearch+https://user:pwd@host:9200/")
However, it gives the error:
Can't load plugin: sqlalchemy.dialects:elasticsearch.https
Can anyone please help?

TLDR; Simply install elasticsearch-dbapi:
pip install elasticsearch-dbapi
Details:
SQLAlchemy uses "dialects" to support reading and writing to different DBMS's.
SQLAlchemy natively supports MS SQL Server, Oracle, MySQL, Postgres and SQLite as well as others. Here is the full list: https://docs.sqlalchemy.org/en/14/dialects/
Elasticsearch is not in that list, so you need to install a library that provides the dialect for reading from and writing to Elasticsearch.
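Once elasticsearch-dbapi is installed it registers the elasticsearch+http / elasticsearch+https dialect, so a connection along the following lines should work. This is a minimal sketch roughly following the library's readme; it assumes SQLAlchemy 1.x, and the user, password, host and the "flights" index are placeholders:

from sqlalchemy.engine import create_engine

# the dialect name is provided by elasticsearch-dbapi
engine = create_engine("elasticsearch+https://user:pwd@host:9200/")

# query an index by name, as if it were a table
rows = engine.connect().execute("SELECT * FROM flights LIMIT 10")
for row in rows:
    print(row)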

Related

Fb-Prophet, Apache Spark in Colab and AWS SageMaker/Lambda

I am using Google Colab to create a model with FbProphet, and I am trying to use Apache Spark in the same Colab notebook. Can I upload this Colab notebook to AWS SageMaker/Lambda for free (i.e. without being charged for Apache Spark, only for AWS SageMaker)?
In short, you can upload the notebook into SageMaker without any issue. A few things to keep in mind:
If you are using the pyspark library in Colab and running Spark locally, you should be able to do the same by installing the necessary pyspark packages in a SageMaker Studio kernel (see the sketch after this answer). You will only pay for the underlying compute of the notebook instance. If you are just experimenting, I would recommend creating a free account on https://studiolab.sagemaker.aws/ and trying things out there.
If you had a separate Spark cluster set up, you may need a similar setup in AWS using EMR so that you can connect to the cluster to execute the job.
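For the local-mode case, something like the following should run in a SageMaker Studio (or Studio Lab) notebook once pyspark has been installed in the kernel, e.g. with pip install pyspark. The app name and sample data are just placeholders:

from pyspark.sql import SparkSession

# run Spark in local mode on the notebook's own compute;
# you only pay for the notebook instance itself
spark = SparkSession.builder.master("local[*]").appName("prophet-demo").getOrCreate()

df = spark.createDataFrame([(1, "2021-01-01"), (2, "2021-01-02")], ["id", "ds"])
df.show()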

psycopg2 fails on AWS Glue on subpackage _psycopg

I am trying to get a Glue Spark job running with Python to talk to a Redshift cluster.
But I have trouble getting psycopg2 to run ... has anybody got this going? It complains about a sub-package _psycopg.
Help please! Thanks.
AWS Glue has trouble with modules that aren't pure Python libraries. Try using pg8000 as an alternative; a sketch of what that looks like is below.
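pg8000 is a pure-Python Postgres driver, and Redshift speaks the Postgres wire protocol, so something like this rough sketch should work from a Glue job (the endpoint, database and credentials are placeholders):

import pg8000

# pure Python, so Glue can import it without any native _psycopg-style sub-package
conn = pg8000.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439,
    database="dev",
    user="awsuser",
    password="...",
)
cur = conn.cursor()
cur.execute("SELECT 1")
print(cur.fetchone())
conn.close()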
Now with Glue version 2 you can pass Python libraries to Glue jobs as parameters, via --additional-python-modules. I used psycopg2-binary instead of psycopg2 and it worked for me; in the code I still do import psycopg2.
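For example, if you create the job with boto3, the module can be passed through DefaultArguments. This is only a sketch; the job name, role and script location are placeholders:

import boto3

glue = boto3.client("glue")
glue.create_job(
    Name="my-redshift-job",                # placeholder
    Role="MyGlueServiceRole",              # placeholder IAM role
    GlueVersion="2.0",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/job.py",  # placeholder
        "PythonVersion": "3",
    },
    DefaultArguments={"--additional-python-modules": "psycopg2-binary"},
)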

Migrating Postgres to Redis

Can we migrate directly from Postgres to Redis? I was trying with the npm package "postgres-redis" but got stuck. I have a huge amount of data stored in a Postgres DB on my local machine, and I want to migrate this data to Redis. How can this be achieved?
You can import your Postgres tables into Redis using this tool: https://github.com/Redislabs-Solution-Architects/riot
See the "Import databases" section of the readme.
You will need to use the Postgres JDBC driver documented here: https://www.postgresql.org/docs/7.4/jdbc-use.html
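If you would rather script the migration yourself instead of using RIOT, a rough Python sketch with psycopg2 and redis-py looks like this (the table, columns, key pattern and connection details are all placeholders; for a really large table you would also want a server-side cursor and Redis pipelining, but the idea is the same):

import psycopg2
import redis

pg = psycopg2.connect(host="localhost", dbname="mydb", user="postgres", password="...")
r = redis.Redis(host="localhost", port=6379)

cur = pg.cursor()
cur.execute("SELECT id, name, email FROM users")
for row_id, name, email in cur:
    # one Redis hash per row, keyed by the primary key
    r.hset(f"users:{row_id}", mapping={"name": name, "email": email})

cur.close()
pg.close()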

Methods to connect to AWS Redshift through Python 3.x (Windows 7) apart from using psycopg2

Are there any other ways to connect to AWS Redshift through Python 3.x on a Windows 7 64-bit platform, apart from using psycopg2?
Is psycopg2 the only library we can use to connect to Redshift?
You can find all the available drivers on the Postgres wiki: https://wiki.postgresql.org/wiki/Python
There is one other driver supported on Windows, but in general psycopg2 is your best bet; a minimal connection sketch is below.
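Since Redshift speaks the Postgres wire protocol, a psycopg2 connection looks like a normal Postgres connection pointed at port 5439. A minimal sketch with placeholder values:

import psycopg2

conn = psycopg2.connect(
    host="my-cluster.xxxxxxxx.us-east-1.redshift.amazonaws.com",
    port=5439,          # Redshift's default port
    dbname="dev",
    user="awsuser",
    password="...",
)
cur = conn.cursor()
cur.execute("SELECT current_date")
print(cur.fetchone())
conn.close()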

Connecting Cassandra to Python on PyCharm

I am new to Cassandra and trying to connect to it from Python. I use PyCharm as my IDE and am trying to connect to a Cassandra database on a different server from within PyCharm. I tried using the DataStax driver but I am hitting several roadblocks.
import cql
con = cql.connect(host="127.0.0.1", port=9160, keyspace="testKS")
The above is the code I have tried, but it leads to several errors.
Not sure which version of Cassandra you're on, but the newer versions now disable Thrift on port 9160 by default, because the Thrift protocol has been deprecated.
Which driver are you trying to use? If you're following an example, you're probably trying to use a driver that's also been deprecated, due to its dependence on the Thrift model.
You will have much more success using the DataStax Python driver for Cassandra. It installs easily via pip (sudo pip install cassandra-driver) and the getting started guide can get you on the correct path.
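With the DataStax driver the connection goes over the CQL native protocol (port 9042 by default) rather than Thrift. A minimal sketch, reusing the keyspace name from the question:

from cassandra.cluster import Cluster

# native protocol on port 9042, not the deprecated Thrift port 9160
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("testks")   # unquoted keyspace names are stored lower-cased

rows = session.execute("SELECT release_version FROM system.local")
print(rows.one())

cluster.shutdown()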
