I have a requirement where I need to connect to an Oracle database and fetch some data from a table. As I am a beginner with AWS Lambda, I started with the example below.
import cx_Oracle
import os
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    logger.info('begin lambda_handler')
    dsn = cx_Oracle.makedsn("hostname", "1521", "sid")
    con = cx_Oracle.connect("user_id", "password", dsn)
    cur = con.cursor()
    #logger.info('username: ' + username)
    #logger.info('host: ' + host)
    sql = """SELECT COUNT(*) AS TEST_COUNT FROM DUAL"""
    cur.execute(sql)
    columns = [i[0] for i in cur.description]
    rows = [dict(zip(columns, row)) for row in cur]
    logger.info(rows)
    con.close()
    logger.info('end lambda_handler')
    return "Successfully connected to oracle."
I created the following structure on Linux for the deployment package:
Parent_Directory
- lib (all the Oracle Instant Client files are placed under this directory)
- cx_Oracle files
- lambda_function.py
After deploying the package to AWS Lambda and testing it, I get the error below.
[ERROR] DatabaseError: DPI-1047: Cannot locate a 64-bit Oracle Client library: "/var/task/lib/libclntsh.so: file too short". See https://oracle.github.io/odpi/doc/installation.html#linux for help
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 29, in lambda_handler
Could somebody help me achieve this? Also, if there is a better option for connecting to Oracle from AWS Lambda, please share it.
I was running into the same problem. After reading this article, it looks like you are missing the libaio.so.1 file. I am guessing you downloaded the Instant Client on a Windows machine, because from what I understand the libaio file is not included in the Windows download. I was able to download the libaio.so.1 file from this repo. Just make sure to put it in the lib folder with the other Oracle Instant Client files.
Spin up an EC2 instance
Download cx_Oracle-7.3.0-cp37-cp37m-manylinux1_x86_64.whl - https://pypi.org/project/cx-Oracle/#files
Download Oracle instant client - https://www.oracle.com/database/technologies/instant-client/linux-x86-64-downloads.html (instantclient-basiclite-linux.x64-21.3.0.0.0)
Download lib64aio1-0.3.111-1pclos2018.x86_64.rpm
In the EC2 instance, please follow the steps below:
sudo yum update
Install pip (sudo yum install python3-pip) if it is not already installed. By default pip should be available on EC2; you can check the version by running "pip3 --version"
The next step is to install virtualenv. A virtual environment is a tool that keeps the dependencies required by different projects in separate places, by creating isolated Python environments for them.
sudo pip3 install virtualenv
virtualenv oraclelambda (any folder name); you will see a directory named oraclelambda
source oraclelambda/bin/activate
pip install cx_Oracle-7.3.0-cp37-cp37m-manylinux1_x86_64.whl (you will notice cx_Oracle installed in the oraclelambda directory)
sudo yum remove libaio-0.3.109-13.amzn2.0.2.x86_64
sudo yum install lib64aio1-0.3.111-1pclos2018.x86_64.rpm
Next go to /usr/lib64 – you will see libaio.so.1.0.1 (copy this file)
Next step: extract the Oracle Instant Client
Paste the libaio.so.1.0.1 file into the Instant Client folder and rename it to libaio.so.1
Next step: take the Instant Client folder (for instance instantclient_21_3) out of the extracted zip and zip it
Create a new folder named python and create another lib folder inside the python folder
Next step: copy the python3.7 folder from oraclelambda/lib64 and place it in python/lib. The folder structure should be python\lib\python3.7\site-packages\
Now zip the python folder
At this point we have two zip files:
instantclient
python
We are going to upload these two zip files as Lambda layers.
Create the Lambda function, attach the zip files as layers, and have the proper security groups created.
Please add the following to the Lambda's environment variables so that it can locate the Oracle Instant Client library:
Key = LD_LIBRARY_PATH
Value = /opt/
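If the function still raises DPI-1047 after setting the variable, a throwaway handler like the one below (a sketch, not part of the steps above) can confirm what the runtime actually sees under /opt:

```python
import os

def lambda_handler(event, context):
    # Hypothetical debugging handler: confirm the layer contents are where
    # cx_Oracle expects them before attempting a connection.
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
    opt_contents = os.listdir("/opt") if os.path.isdir("/opt") else []
    print("/opt contents:", opt_contents)
    return opt_contents
```

If libclntsh.so (and its dependencies such as libaio.so.1) do not show up in the listing, the layer zip was built with the wrong internal structure.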
I was having a similar issue. It was resolved after I downloaded the Oracle Client library & libaio and bundled it with cx_Oracle while creating a Lambda Layer. You may follow the below video I created for this activity. Hope it is helpful to you - https://www.youtube.com/watch?v=BYiueNog-TI&ab_channel=BorrowedCloud
Related
I have a Python script which checks whether a package is installed locally; if it is installed, it upgrades it, otherwise it installs it locally.
import pip
import pkg_resources

installed_packages = [dist.project_name for dist in pkg_resources.working_set]
package_name = 'sigma-cli'
if package_name in installed_packages:
    print(package_name + ' is already installed, upgrading the package now\n')
    pip.main(['install', '--upgrade', package_name])
else:
    print(package_name + ' is not installed, installing it now')
    pip.main(['install', package_name])
After installing/upgrading, I am trying to use the package as a CLI:
detection_query = os.system('C:\\Users\\username\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python310\\Scripts\\sigma.exe convert -p crowdstrike_fdr -t splunk sigma_rule.yml')
I would like to make sure that this script runs on every OS and on different machines, not just my local machine. For that, I would like to remove the explicit path in the above command and instead let the script find the location of the installed package. Is there any way I can achieve this?
Thanks in advance for your help!
The script wrappers are put into a path which is environment- and platform-dependent. You can find the installation path with the stdlib sysconfig module:
import sysconfig
script_dir = sysconfig.get_path("scripts")
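For example, here is a sketch that locates the wrapper portably (assuming the console script installed by sigma-cli is named sigma) instead of hard-coding the Windows path:

```python
import os
import shutil
import sysconfig

# Look for the console-script wrapper on PATH first, then fall back to the
# interpreter's scripts directory reported by sysconfig.
exe = shutil.which("sigma") or os.path.join(sysconfig.get_path("scripts"), "sigma")
print(exe)
# os.system(f'"{exe}" convert -p crowdstrike_fdr -t splunk sigma_rule.yml')
```

On Windows, sysconfig.get_path("scripts") resolves to the Scripts directory of whichever interpreter is running, so the same code works across machines.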
Currently I'm working on a school project where I have to use MariaDB in a Python 3 assignment. I have to build a Python script that connects to a database and puts information into it. So, said and done, I created a Python script:
import psutil
import socket
import mariadb

machine = socket.gethostname()
memory = psutil.virtual_memory()[2]
disk = psutil.disk_usage('/').percent
cpu = psutil.cpu_percent()
print(machine, memory, disk, cpu)

def insert_data(machine, memory, disk, cpu):
    conn = None
    try:
        conn = mariadb.connect(
            user="db_user",
            password="welkom01",
            host="192.168.0.2",
            port=3306,
            database="gegevens")
        insert_query = """INSERT INTO info (machine, memory, disk, cpu) VALUES (?, ?, ?, ?);"""
        verkregen_data = (machine, memory, disk, cpu)
        cursor = conn.cursor()
        cursor.execute(insert_query, verkregen_data)
        conn.commit()  # commit is a method of the connection, not the cursor
        print("Total", cursor.rowcount, "rows successfully written to database gegevens")
        cursor.close()
    except mariadb.Error as error:
        print(f"Error connecting to MariaDB Platform: {error}")
    finally:
        if conn:
            conn.close()
            print("MariaDB connection is closed")

insert_data(machine, memory, disk, cpu)
But now my real issue starts. I'm working on a Linux CentOS 8 system where I have to put the script. I have to install the Python 3 MariaDB module, but when I try to do so:
[screenshot of the error message shown when trying to install]
What I have done so far:
-> Installed mariadb-server
-> Installed the connector from MariaDB's own website: link to own website
-> Installed the Python developer tools: yum -y install openssl-devel bzip2-devel libffi-devel | yum -y groupinstall "Development Tools"
But I can't figure out what I'm doing wrong and why it won't work. I hope some of you can help me out.
Version information
You have to download the latest version of MariaDB Connector/C for Cent OS/8:
$ wget https://downloads.mariadb.com/Connectors/c/connector-c-3.1.10/mariadb-connector-c-3.1.10-centos8-amd64.tar.gz
Then you have to extract the package:
$ tar -xzf mariadb-connector-c-3.1.10-centos8-amd64.tar.gz
Copy the bin, lib and include folders to the right destination (either locally somewhere in your home directory, or, if it should be available to all users and you have the permissions, under e.g. /usr/local).
Make sure that your PATH environment variable contains the bin path. You can check this by calling mariadb_config from your console:
$ mariadb_config --cc_version
If it is successfully installed and the path is set, it should report 3.1.10.
If the library directory is not in default place, make sure that your LD_LIBRARY_PATH environment variable contains the directory of MariaDB Connector/C libraries.
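If you prefer to script that check, the same verification can be done from Python with the stdlib (a sketch; it assumes mariadb_config is on your PATH once the bin folder is copied):

```python
import shutil
import subprocess

# Locate mariadb_config on PATH; if found, ask it for the Connector/C version.
cfg = shutil.which("mariadb_config")
if cfg:
    out = subprocess.run([cfg, "--cc_version"], capture_output=True, text=True)
    print("MariaDB Connector/C", out.stdout.strip())
else:
    print("mariadb_config not found on PATH - check the bin folder copy step")
```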
Now you can install MariaDB Connector/Python with
pip3 install mariadb
I am trying to connect to my Splunk server via Python on my Windows laptop.
I downloaded splunklib and splunk-sdk. However, when I run
import splunklib.client as client
I get an error of
ModuleNotFoundError: No module named 'splunklib.client'; 'splunklib' is not a package
Any ideas on why this is occurring and suggestions on to how to fix this or the best way to access Splunk via Python?
Did you properly install the splunk-sdk? You would normally use something like pip to install it.
pip install splunk-sdk
Alternatively, you can install it into the PYTHONPATH
Refer to
https://dev.splunk.com/enterprise/docs/python/sdk-python/gettingstartedpython/installsdkpython/
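The "'splunklib' is not a package" wording often means Python found something named splunklib that is a plain module (for example, a stray splunklib.py next to your script) rather than the SDK package. A small stdlib check can tell the two cases apart without importing anything (a diagnostic sketch, not part of the SDK):

```python
import importlib.util

def diagnose(mod_name):
    """Classify what an import name resolves to, without importing it."""
    spec = importlib.util.find_spec(mod_name)
    if spec is None:
        return "not installed"
    if spec.submodule_search_locations is None:
        return "plain module (possibly a shadowing file), not a package"
    return "package at " + list(spec.submodule_search_locations)[0]

print(diagnose("splunklib"))
```

If this reports a plain module, rename or remove the shadowing file and the import should resolve to the installed SDK.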
Windows requires a manual setup of the SDK.
Download the Splunk Software Development Kit for Python as a zip file.
Unzip the zip file into the same directory as your program source.
Add the following lines to your source code before import splunklib.client as client:
import os, sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "splunk-sdk-python-master"))
Another option is to unzip the sdk to another folder and specify the absolute path in the sys.path.insert().
I have an AWS Lambda function that I want to connect to my on-prem SQL Server to read and write data. I am using Python and pyodbc. I have pyodbc installed (a compiled zip file in an S3 bucket, added to the Lambda through a layer), but when I try to run this code I get an odd error:
import boto3
import pyodbc

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # print(help(pyodbc))
    server = "Server"
    database = "Database"
    username = "AWS-Lamdba-RO"
    password = "Password"
    cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER=' + server + ';DATABASE=' + database + ';UID=' + username + ';PWD=' + password)
    cursor = cnxn.cursor()
This is the error:
[ERROR] AttributeError: module 'pyodbc' has no attribute 'connect' Traceback (most recent call last): File "/var/task/lambda_function.py", line 13, in lambda_handler cnxn = pyodbc.connect('DRIVER={ODBC Driver 13 for SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
All I'm finding online is people who are unable to get the pyodbc library installed in the first place, so having got past that sticking point I thought I was free and clear. Can anyone explain what I've run into now?
I got pyodbc from here:
https://github.com/Miserlou/lambda-packages/tree/master/lambda_packages/pyodbc
AWS didn't recognise .tar.gz files, so I changed it to a zip file and also added the folder structure that another site I googled told me was necessary:
\python\lib\python3.7\site-packages\pyodbc
that folder contains:
libodbc.so.2
pyodbc.so
I uploaded this Zip file to an S3 bucket and pointed a Lambda layer at it.
Have I done something silly with this?
Notable Points
The environment in which you build the layer should be the same as your Lambda function runtime environment. ie. if you build the package in a python3.7 environment then lambda should be launched with the python 3.7 runtime.
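A quick way to confirm the build environment and the Lambda runtime actually match is to print the interpreter version in both places and compare:

```python
import sys

# Run this both in the build container and inside the Lambda function;
# the (major, minor) pairs must match for compiled extensions to load.
print(sys.version_info[:2])
```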
Guidance from AWS Support
pyodbc is distributed in a shared-lib format (which can be a little challenging to compile for Lambda). It also requires other shared libs such as unixODBC (the connector the wrapper is built around) and the database drivers (in this case we will be using msodbcsql17).
The folder structure for this layer should look like this:
|-- pyodbc-layer
|-- bin
|-- include
|-- lib
|-- odbc.ini
|-- ODBCDataSources
|-- odbcinst.ini
|-- python
|-- pyodbc-4.0.26.dist-info
|-- pyodbc.cpython-37m-x86_64-linux-gnu.so
|-- share
In order to generate this layer, you need to execute the following steps:
create an EC2 Ubuntu 18.04 LTS instance (t2.micro is fine) and SSH into it.
Install docker using snap with the following command:
sudo snap install docker
Run the following command to create a container based on Amazon Linux with Python 3.7 as its environment. Keep in mind that you can change to Python 3.6 just by changing build-python3.7 to build-python3.6.
sudo docker run -it --rm -iv${PWD}:/host-volume --entrypoint bash -e ODBCINI=/opt/odbc.ini -e ODBCSYSINI=/opt/ lambci/lambda:build-python3.7
When you first run this command, docker will download the Amazon Linux image from Docker Hub (it may take 30-60 seconds to download and unpack). After the download, you will be dropped into the image's bash shell.
(In case you weren't redirected to the docker bash, just run the command again.)
After you are in the docker's bash, copy and paste the following commands to your docker:
curl ftp://ftp.unixodbc.org/pub/unixODBC/unixODBC-2.3.7.tar.gz -O
tar xzvf unixODBC-2.3.7.tar.gz
cd unixODBC-2.3.7
./configure --sysconfdir=/opt --disable-gui --disable-drivers --enable-iconv --with-iconv-char-enc=UTF8 --with-iconv-ucode-enc=UTF16LE --prefix=/opt
make
make install
cd ..
rm -rf unixODBC-2.3.7 unixODBC-2.3.7.tar.gz
curl https://packages.microsoft.com/config/rhel/6/prod.repo > /etc/yum.repos.d/mssql-release.repo
yum install e2fsprogs.x86_64 0:1.43.5-2.43.amzn1 fuse-libs.x86_64 0:2.9.4-1.18.amzn1 libss.x86_64 0:1.43.5-2.43.amzn1
ACCEPT_EULA=Y yum install msodbcsql17 --disablerepo=amzn*
export CFLAGS="-I/opt/include"
export LDFLAGS="-L/opt/lib"
cd /opt
cp -r /opt/microsoft/msodbcsql17/ .
rm -rf /opt/microsoft/
mkdir /opt/python/
cd /opt/python/
pip install pyodbc -t .
cd /opt
cat <<EOF > odbcinst.ini
[ODBC Driver 17 for SQL Server]
Description=Microsoft ODBC Driver 17 for SQL Server
Driver=/opt/msodbcsql17/lib64/libmsodbcsql-17.7.so.2.1
UsageCount=1
EOF
cat <<EOF > odbc.ini
[ODBC Driver 17 for SQL Server]
Driver = ODBC Driver 17 for SQL Server
Description = My ODBC Driver 17 for SQL Server
Trace = No
EOF
cd /opt
zip -r9 ~/pyodbc-layer.zip .
(In case you get errors related to "read-only file system" just exit the docker shell by using the command "exit" and try the steps again)
Now make sure you exit your container using the command:
exit
Now the file will be available in the folder "/home/ubuntu". You can upload it using the AWS CLI (if you have it configured) or retrieve it via SFTP. If you are going to use Cyberduck/WinSCP/FileZilla or even the sftp shell, you will need to change permissions on the file in order to download it.
The fastest way to give the file permission to be downloaded is using the following line:
sudo chmod o+rw pyodbc-layer.zip
Now you can retrieve it using sftp or any sftp-compatible client.
From your description, I believe you may have gotten your folder structure wrong.
If you were to look in your zip file, you should see the following structure:
layerZip.zip
└ python
└ lib
└ python3.7
└ site-packages
└ pyodbc
But I believe you may actually have
layerZip.zip
└ \
└ python
└ lib
└ python3.7
└ site-packages
└ pyodbc
But honestly, just use this structure:
layerZip.zip
└ python
└ pyodbc
It'll work just as well; it just sets the module up for every Python runtime version instead of pinning it to python3.7.
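Both layouts resolve because Lambda unpacks every layer under /opt and, for Python runtimes, places both /opt/python and /opt/python/lib/pythonX.Y/site-packages on the import path. A one-liner in the handler shows what the function actually sees (it prints an empty list when run outside Lambda):

```python
import sys

# Lambda puts layer directories such as /opt/python on sys.path at runtime.
layer_paths = [p for p in sys.path if p.startswith("/opt")]
print(layer_paths)
```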
I had the same issue. You must use the correct version of Python when building the package to upload to Lambda.
I did a 'pip install pyodbc -t .'
It placed the following:
drwxr-xr-x 2 root root 4096 Sep 20 21:29 pyodbc-4.0.27.dist-info
-rwxr-xr-x 1 root root 658704 Sep 20 21:29 pyodbc.cpython-36m-x86_64-linux-gnu.so
The lib is very specific to the version of Python. The '-36m-' in the file name means it will work in the Lambda Python 3.6 environment.
My initial issue was that I was using lambci/lambda:build-python3.7 for my docker environment, so pyodbc installed the Python 3.7 version of the library, '-37m-'.
Since Lambda was looking for the 3.6 version, it did not see the 3.7 version.
I switched to using lambci/lambda:build-python3.6 and all was better.
I followed this article for getting everything working:
https://medium.com/faun/aws-lambda-microsoft-sql-server-how-to-66c5f9d275ed
I've followed the official installation guide but haven't had any luck so far. I wonder if cx_Oracle can work on AWS SageMaker's virtual environment. The steps I've used so far are:
Create a /opt/oracle directory and unzip the basic instantclient in it.
sudo yum install libaio
sudo sh -c "echo /opt/oracle/instantclient_18_3 > /etc/ld.so.conf.d/oracle-instantclient.conf" and
sudo ldconfig
And finally exported the LD_LIBRARY_PATH with: export LD_LIBRARY_PATH=/opt/oracle/instantclient_18_3:$LD_LIBRARY_PATH
When trying to run a connection inside the notebook with connection = cx_Oracle.connect(usr + '/' + pwd + '@' + url), I receive the DPI-1047 error code saying that libclntsh.so cannot be opened, even though that library is in the /opt/oracle folder. As another option, when running the same connection through the terminal Python console, I get the ORA-01804 error code, which says that the timezone files were not properly read; I'm trying to fix that too, but I suspect it is related to cx_Oracle not finding its library folder. (Now, explain to me: why does it have to be so difficult for a billionaire company to create a decent library import and installation?)
Is there a step I'm missing? Is there a detail about AWS SageMaker that I should account for? Also, is there another option for extracting data from an Oracle server through Python and AWS?
Hi and thank you for using SageMaker!
After some effort, I was finally able to figure out a sequence of steps that allowed me to query an Oracle 12 database from within a SageMaker notebook instance. Here are the steps that I took:
I created an Oracle 12 database using Amazon RDS for testing purposes. (You can of course skip this step if you already have an Oracle database available.)
I downloaded the Oracle 12 Instant Client RPM as described here. Note that you will need an Oracle Account in order to download this file.
I uploaded the RPM to my SageMaker Notebook Instance from within JupyterLab. Note that this can take 2-3 minutes to completely upload before proceeding to the next step. (I initially had problems running the installation because the upload was still in progress.)
I ran all of the following commands from the Jupyter terminal as prescribed in the Oracle instructions:
cd SageMaker
sudo yum install oracle-instantclient12.2-basic-12.2.0.1.0-1.x86_64.rpm
sudo sh -c "echo /usr/lib/oracle/12.2/client64/lib > /etc/ld.so.conf.d/oracle-instantclient.conf"
sudo ldconfig
sudo mkdir -p /usr/lib/oracle/12.2/client64/lib/network/admin
# Restart Jupyter...
sudo restart jupyter-server
I then installed the cx_Oracle library:
source activate python3
pip install cx_Oracle --upgrade
Then lastly, I created a new notebook using the conda_python3 kernel:
import cx_Oracle
db = cx_Oracle.connect("my_username/my_password@my-rds-instance.ccccccccccc.us-east-1.rds.amazonaws.com/ORCL")
# Example query
cursor = db.cursor()
for row in cursor.execute('select * from DBA_TABLES'):
    print(row)
...and then it worked!
Note that it took me a few tries to figure out the exact connection string, which can differ depending on how your database is configured. Unfortunately, the error messages were hard to understand - in my case, I had the error ORA-12504: TNS:listener was not given the SERVICE_NAME in CONNECT_DATA until I had specified the /ORCL at the end of the connection string.
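For reference, the pieces of the EZConnect string that finally worked can be assembled like this; the host is the same placeholder as above, and only the trailing service name is the part that cured the ORA-12504 error:

```python
host = "my-rds-instance.ccccccccccc.us-east-1.rds.amazonaws.com"  # placeholder
port = 1521            # default Oracle listener port
service_name = "ORCL"  # omitting this produced the ORA-12504 error above

dsn = f"{host}:{port}/{service_name}"
print(dsn)
# db = cx_Oracle.connect("my_username", "my_password", dsn)
```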
If you need to do these steps frequently, you can add the installation and configuration of the Oracle client in a SageMaker Lifecycle Configuration script. I haven't tested that scenario, but it might be worth a try!
One last thing, I noticed in your question that you are using the Oracle 18 client. I didn't test that exact scenario, since I only have access to an Oracle 12 database. However, the Oracle 12 client should be able to connect to an Oracle 18 database, too.
Best,
Kevin