How to access generated output file in Docker - python-3.x

I have dockerized my Python application. The application connects to an Oracle database, pulls 10 rows from a table, and then generates an Excel file. I was able to build the image successfully with all dependent libraries, and it runs fine. Now I am not sure how to get the generated Excel file (batchtable.xlsx) out of Docker.
I am new to Docker and would appreciate your suggestions. I have checked the output without storing the records in Excel, and it prints fine on the console, so there is no code issue.
Dockerfile
FROM python:3.7.4-slim-buster
RUN apt-get update && apt-get install -y libaio1 wget unzip
WORKDIR /opt/oracle
COPY File.py /opt/oracle
RUN wget https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip && \
unzip instantclient-basiclite-linuxx64.zip && rm -f instantclient-basiclite-linuxx64.zip && \
cd /opt/oracle/instantclient* && rm -f *jdbc* *occi* *mysql* *README *jar uidrvci genezi adrci && \
echo /opt/oracle/instantclient* > /etc/ld.so.conf.d/oracle-instantclient.conf && ldconfig
RUN python -m pip install --upgrade pip
RUN python -m pip install cx_Oracle
RUN python -m pip install pandas
RUN python -m pip install openpyxl
CMD [ "python", "/opt/oracle/File.py" ]
File.py
import cx_Oracle
import pandas as pd
#creating database connection
dsn_tns = cx_Oracle.makedsn('dev-tr01.com', '1222', service_name='ast041.com')
conn = cx_Oracle.connect(user=r'usr', password='3451', dsn=dsn_tns)
c = conn.cursor()
query ='SELECT * FROM Employee WHERE ROWNUM <10'
result = pd.read_sql(query, con=conn)
result.to_excel("batchtable.xlsx")
conn.close()

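You can access your data by mounting a host directory into your container and having the script write its output there. For example, the following mounts the current directory at /data, so the script would need to write to /data/batchtable.xlsx: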
docker run -ti -v $(pwd):/data IMAGE
https://docs.docker.com/storage/volumes/#start-a-container-with-a-volume

Add a -v switch to your docker run command. For instance:
docker run -v <path>:/output YOUR_IMAGE_NAME
Replace <path> with a valid path on your machine, for instance c:\temp on Windows.
Then change your program to write to that directory:
result.to_excel("/output/batchtable.xlsx")
If you are running Docker Desktop, make sure the drive is shared in the Docker Desktop settings.
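As a rough sketch of the idea above, the script could take its output directory from an environment variable instead of hard-coding it. The variable name OUTPUT_DIR and the helper resolve_output_path are assumptions made for illustration; they are not part of the original File.py.

```python
import os
from pathlib import PurePosixPath


def resolve_output_path(filename, env=None):
    """Return the container path the report should be written to.

    OUTPUT_DIR is a hypothetical variable name; /output matches the
    mount point used in the docker run example above.
    """
    env = os.environ if env is None else env
    output_dir = PurePosixPath(env.get("OUTPUT_DIR", "/output"))
    return output_dir / filename


if __name__ == "__main__":
    # In File.py this string would be passed to result.to_excel(...).
    print(str(resolve_output_path("batchtable.xlsx")))
```

With this in place, the same image could be started with -v <path>:/output, or with a different mount point together with -e OUTPUT_DIR=..., without editing the script.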

Related

Why is this container missing one file in its volume mount?

Title is the question.
I'm hosting many Docker containers on a rather large Linux EC2 instance. One container in particular needs access to a file that gets transferred to the host before run time. The file in question is copied from a Windows file server to the EC2 instance using Control-M.
When the container runs, we pass -v to specify a volume mount with a path on the host to that transferred file.
The file is not found in the container. If I create a new file in the container, the new file appears on the host. When I create a file on the host, it appears in the container. When I make a copy of the transferred file using cp -p, the copied file DOES show up in the container, but the original still does not.
I don't understand why this happens. My suspicion is that it has something to do with the file being on a Windows server before Control-M copies it to the EC2 instance.
Details:
The file lives in the path /folder_path/project_name/resources/file.txt
Its permissions are -rwxrwxr-x 1 pyadmin pyadmin, where pyadmin maps to the container's root user.
It's approximately 38 MB in size, and when I run file file.txt I get the output ASCII text, with CRLF line terminators.
The repo also has a resources folder with files already in it when it is cloned, but none of their names conflict.
Docker Version: 20.10.13
Dockerfile:
FROM python:3.9.11-buster
SHELL ["/bin/bash", "-c"]
WORKDIR /folder_path/project_name
RUN apt-get auto-clean && apt-get update && apt-get install -y unixodbc unixodbc-dev && apt-get upgrade -y
RUN python -m pip install --upgrade pip poetry
COPY . .
RUN python -m pip install --upgrade pip poetry && \
poetry config virtualenvs.create false && \
poetry install
ENTRYPOINT [ "python" ]
Command to start container:
docker run --pull always --rm \
-v /folder_path/project_name/logs:/folder_path/project_name/logs \
-v /folder_path/project_name/extracts:/folder_path/project_name/extracts \
-v /folder_path/project_name/input:/folder_path/project_name/input \
-v /folder_path/project_name/output:/folder_path/project_name/output \
-v /folder_path/project_name/resources:/folder_path/project_name/resources \
my-registry.com/folder_path/project_name:image_tag

Unable to run aliyun-cli in Docker:stable container after installing it. Errors as command not found

I am unsure whether Stack Overflow or Server Fault is the right Stack Exchange site, but I'm going with Stack Overflow because the Alibaba Cloud site says to add a tag and ask questions here.
I'm currently building an image based on docker:stable, which is an Alpine distro, that will have aliyun-cli installed and available for use. However, I get a weird Command Not Found error when I run it. I have followed the guide here https://partners-intl.aliyun.com/help/doc-detail/139508.htm and moved the aliyun binary to /usr/sbin
Here is my Dockerfile for example
FROM docker:stable
RUN apk update && apk add curl
#Install python 3
RUN apk update && apk add python3 py3-pip
#Install AWS Cli
RUN pip3 install awscli --upgrade
# Install Aliyun CLI
RUN curl -L -o aliyun-cli.tgz https://aliyuncli.alicdn.com/aliyun-cli-linux-3.0.30-amd64.tgz
RUN tar -xzvf aliyun-cli.tgz
RUN mv aliyun /usr/bin
RUN chmod +x /usr/bin/aliyun
RUN rm aliyun-cli.tgz
However, when I run aliyun (which can be auto-completed), I get this:
/ # aliyun
sh: aliyun: not found
I've tried moving it to other bin directories, and I've tried changing into the folder and calling it explicitly, but I always get command not found. Any suggestions would be welcome.
Did you check this Dockerfile?
Also, why do you need to install aws-cli in the same image and maintain it yourself when AWS provides a managed aws-cli image?
docker run --rm -it amazon/aws-cli --version
That is all you need for the aws-cli image, but if you want it in an existing image, you can try:
RUN pip install awscli --upgrade
DockerFile
FROM python:2-alpine3.8
LABEL com.frapsoft.maintainer="Maik Ellerbrock" \
com.frapsoft.version="0.1.0"
ARG SERVICE_USER
ENV SERVICE_USER ${SERVICE_USER:-aliyun}
RUN apk add --no-cache curl
RUN curl https://raw.githubusercontent.com/ellerbrock/docker-collection/master/dockerfiles/alpine-aliyuncli/requirements.txt > /tmp/requirements.txt
RUN \
adduser -s /sbin/nologin -u 1000 -H -D ${SERVICE_USER} && \
apk add --no-cache build-base && \
pip install aliyuncli && \
pip install --no-cache-dir -r /tmp/requirements.txt && \
apk del build-base && \
rm -rf /tmp/*
USER ${SERVICE_USER}
WORKDIR /usr/local/bin
ENTRYPOINT [ "aliyuncli" ]
CMD [ "--help" ]
build and run
docker build -t aliyuncli .
docker run -it --rm aliyuncli
output
docker run -it --rm abc aliyuncli
usage: aliyuncli <command> <operation> [options and parameters]
<aliyuncli> the valid command as follows:
batchcompute | bsn
bss | cms
crm | drds
ecs | ess
ft | ocs
oms | ossadmin
ram | rds
risk | slb
ubsms | yundun
After a lot of searching, I found a GitHub issue in the official aliyun-cli repository which describes that the released binary does not work on Alpine Linux because it is not compatible with musl libc.
Link here: https://github.com/aliyun/aliyun-cli/issues/54
Following the workarounds there, I built a multi-stage Dockerfile like the following, which fixed my issue.
Dockerfile
#Build aliyun-cli binary ourselves because of issue
#in alpine https://github.com/aliyun/aliyun-cli/issues/54
FROM golang:1.13-alpine3.11 as cli_builder
RUN apk update && apk add curl git make
RUN mkdir /srv/aliyun
WORKDIR /srv/aliyun
RUN git clone https://github.com/aliyun/aliyun-cli.git
RUN git clone https://github.com/aliyun/aliyun-openapi-meta.git
ENV GOPROXY=https://goproxy.cn
WORKDIR aliyun-cli
RUN make deps; \
make testdeps; \
make build;
FROM docker:19
#Install python 3 & jq
RUN apk update && apk add python3 py3-pip python3-dev jq
#Install AWS Cli
RUN pip3 install awscli --upgrade
# Install Aliyun CLI from builder
COPY --from=cli_builder /srv/aliyun/aliyun-cli/out/aliyun /usr/bin
RUN aliyun configure set --profile default --mode EcsRamRole --ram-role-name build --region cn-shanghai
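A shell reporting "not found" for a file that clearly exists is commonly a sign that the dynamic loader named inside the binary is missing, which is what happens to a glibc-linked executable on a musl-based image. The sketch below shows one way to check which loader a binary asks for by reading its PT_INTERP program header. The function name elf_interpreter is hypothetical, and the code only handles 64-bit little-endian ELF files; it is an illustration of the diagnosis, not something taken from the linked issue.

```python
import struct
import sys

PT_INTERP = 3  # program header type that names the dynamic loader


def elf_interpreter(data):
    """Return the loader path requested by an ELF image, or None if it has none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at byte 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for index in range(e_phnum):
        base = e_phoff + index * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            # ELF64 program header: p_offset at +8, p_filesz at +32.
            (p_offset,) = struct.unpack_from("<Q", data, base + 8)
            (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\x00").decode()
    return None  # statically linked binaries have no PT_INTERP entry


if __name__ == "__main__":
    # Usage: python check_interp.py /usr/bin/aliyun
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "->", elf_interpreter(handle.read()))
```

If the printed loader path does not exist inside the container, the kernel cannot start the program and the shell reports it as not found, even though the file itself is present.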

docker run error : DPI-1047: Cannot locate a 64-bit Oracle Client library

I am trying to dockerize a very simple Python application that connects to an Oracle database, and run it in Docker. The application runs fine on my local machine.
I was able to build the image successfully, but I get an error when running it in Docker.
DockerFile:
FROM python:3
ADD File.py /
RUN pip install cx_Oracle
RUN pip install pandas
RUN pip install openpyxl
CMD [ "python", "./File.py" ]
File.py:
import cx_Oracle
import pandas as pd
#creating database connection
dsn_tns = cx_Oracle.makedsn('dev-tr01.com', '1222', service_name='ast041.com')
conn = cx_Oracle.connect(user=r'usr', password='3451', dsn=dsn_tns)
c = conn.cursor()
query ='SELECT * FROM Employee WHERE ROWNUM <10'
result = pd.read_sql(query, con=conn)
result.to_excel("batchtable.xlsx")
conn.close()
Error:
docker run python_batchdriver:latest
cx_Oracle.DatabaseError: DPI-1047: Cannot locate a 64-bit Oracle Client library: "libclntsh.so: cannot open shared object file: No such file or directory". See https://oracle.github.io/odpi/doc/installation.html#linux for help
Update: upgrade to the latest cx_Oracle release (renamed to python-oracledb). This doesn't necessarily need Instant Client, which makes installation a lot easier. See the release announcement.
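A minimal sketch of what the upgraded script might look like with python-oracledb in its default Thin mode, which does not load Instant Client. The helper names build_easy_connect and fetch_rows are made up for illustration, and the sketch assumes the driver has been installed with pip install oracledb.

```python
def build_easy_connect(host, port, service_name):
    """Build an Easy Connect string of the form host:port/service_name."""
    return f"{host}:{port}/{service_name}"


def fetch_rows(user, password, dsn, query):
    """Run a query in python-oracledb's default Thin mode and return all rows."""
    # Imported lazily so build_easy_connect works without the driver installed.
    import oracledb  # assumption: installed with `pip install oracledb`

    with oracledb.connect(user=user, password=password, dsn=dsn) as conn:
        with conn.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()


# Example call, using the placeholder values from the question:
# dsn = build_easy_connect("dev-tr01.com", 1222, "ast041.com")
# rows = fetch_rows("usr", "3451", dsn, "SELECT * FROM Employee WHERE ROWNUM < 10")
```

With this approach the Dockerfile would only need the pip install steps; whether Thin mode covers a given database version and feature set should be checked against the python-oracledb documentation.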
For cx_Oracle, you need to install Oracle Instant Client libraries too. See the cx_Oracle installation instructions.
There are various ways to automate installation in Docker. One example is:
RUN wget https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip && \
unzip instantclient-basiclite-linuxx64.zip && \
rm -f instantclient-basiclite-linuxx64.zip && \
cd instantclient* && \
rm -f *jdbc* *occi* *mysql* *jar uidrvci genezi adrci && \
echo /opt/oracle/instantclient* > /etc/ld.so.conf.d/oracle-instantclient.conf && \
ldconfig
You will also need the libaio or libaio1 package.
See Docker for Oracle Database Applications in Node.js and Python.
Also see Install Oracle Instant client into Docker container for Python cx_Oracle
Note that the steps may be different if you are not using a Debian-based Linux distribution.
Your Debian-based Docker image needs an Oracle Instant Client. You can download it manually and copy it into your Docker image. The libaio1 dependency is also required.
Adapt your Dockerfile as shown below.
DockerFile:
FROM python:3
# Installing Oracle instant client
WORKDIR /opt/oracle
# Install dependency
RUN apt-get update && apt-get install -y libaio1
# if extracted file is in current directory named "instantclient_11_2"
# For me instantclient_11_2 for Oracle9i, 10g
# copy it to /opt/oracle/instantclient_11_2
COPY instantclient_11_2 /opt/oracle/instantclient_11_2
# Linking instant client and cleanup
RUN cd /opt/oracle/instantclient* \
&& rm -f *jdbc* *occi* *mysql* *README *jar uidrvci genezi adrci \
&& echo /opt/oracle/instantclient* > /etc/ld.so.conf.d/oracle-instantclient.conf \
&& ldconfig
# Rest is same as yours
ADD File.py /
# To support instantclient_11_2 I used below commented line
# RUN pip install cx-Oracle==7.3.0
RUN pip install cx_Oracle
RUN pip install pandas
RUN pip install openpyxl
CMD [ "python", "./File.py" ]

Python 3 virtualenv and Docker

I'm trying to build a Docker image with Python 3 and virtualenv.
I understand that I wouldn't need to use virtualenv in a Docker image, as I'm only going to use Python 3, yet I see some clean isolation benefits in using virtualenv anyway.
What's the best practice? Should I avoid using virtualenv in Docker?
If that's the case, how can I set up python3 and pip3 to be used as python and pip (without the 3)?
This is my Dockerfile:
FROM openjdk:8-alpine
RUN apk update && apk add bash gcc musl-dev
RUN apk add python3 python3-dev
RUN apk add py3-pip
RUN apk add libxslt-dev libxml2-dev
ENV PROJECT_HOME /opt/app
RUN mkdir -p /opt/app
RUN mkdir -p /opt/app/modules
ENV LD_LIBRARY_PATH /usr/lib/python3.6/site-packages/jep
ENV LD_PRELOAD /usr/lib/libpython3.6m.so
RUN pip3 install jep
RUN pip3 install ads
RUN pip3 install gspread
RUN pip3 list
COPY target/my-server-1.0-SNAPSHOT.jar $PROJECT_HOME/my-server-1.0-SNAPSHOT.jar
WORKDIR $PROJECT_HOME
CMD ["java", "-Dspring.data.mongodb.uri=mongodb://my-mongo:27017/mydb","-jar","./my-server-1.0-SNAPSHOT.jar"]
Thanks
=== UPDATE 1 ===
I'm trying to create a new virtual environment in the WORKDIR, install some libraries, and then execute a shell script. Although I can see that everything is created when I build the image, the environment folder is empty when the container runs.
This is from my Dockerfile:
RUN virtualenv ./env && source ./env/bin/activate && pip install jep \
googleads gspread oauth2client
ENTRYPOINT ["/bin/bash", "./startup.sh"]
startup.sh:
#!/bin/sh
source ./env/bin/activate
java -Dspring.data.mongodb.uri=mongodb://my-mongo:27017/mydb -jar ./my-server-1.0-SNAPSHOT.jar
It builds fine but on docker-compose up -d this is the output:
./startup.sh: source: line 2: can't open './env/bin/activate'
The env folder exists, but it's empty.
Any ideas?
Thanks!
=== UPDATE 2 ===
This is the working config:
RUN virtualenv ./my-env && source ./my-env/bin/activate \
&& pip install gspread==0.6.2 jep oauth2client googleads pandas
CMD ["/bin/bash", "-c", "./startup.sh"]
This is startup.sh:
#!/bin/sh
source ./my-env/bin/activate
java -Dspring.data.mongodb.uri=mongodb://my-mongo:27017/mydb -jar ./my-server-1.0-SNAPSHOT.jar
I don't think using virtualenv in Docker is a real problem; it will just slow down your container builds a bit.
As for renaming pip3 and python3, you can create hard links like this:
ln /usr/bin/python3 /usr/bin/python
ln /usr/bin/pip3 /usr/bin/pip
This assumes the python3 executable is in /usr/bin/. You can find its location by running which python3
P.S.: Your Dockerfile contains a lot of RUN instructions, each of which creates an extra intermediate layer. Combine them to save space and time:
RUN apk update && apk add bash gcc musl-dev \
python3 python3-dev py3-pip \
libxslt-dev libxml2-dev
RUN mkdir -p /opt/app/modules # you don't need the first one, -p will create it for you
RUN pip3 install jep ads gspread
Or combine them even further, if you aren't planning to change them often:
RUN apk update \
&& apk add bash gcc musl-dev \
python3 python3-dev py3-pip \
libxslt-dev libxml2-dev \
&& mkdir -p /opt/app/modules \
&& pip3 install jep ads gspread
The only "workaround" I've found for using virtualenv in my Docker container is to enter the container via ssh, create the environment, install the libraries, and set its folder as a volume in the docker-compose config so that it isn't deleted and I can use it afterward.
Alternatively, I could have it ready and just copy the folder at build time, which might be a good way to save build time, wouldn't it?
Otherwise, if I create it in the Dockerfile and install the libraries there, its folder is empty when the container runs. I don't know why.
I would appreciate it if anyone could suggest a better way to deal with this.
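One way to avoid the source ./env/bin/activate step in a startup script is to call the environment's own interpreter by path, since activation mostly just puts the environment's bin directory first on PATH. The sketch below demonstrates this with the standard-library venv module as a stand-in for virtualenv. The helper names are hypothetical, and it does not explain why the folder was empty at run time.

```python
import os
import subprocess
import tempfile
import venv
from pathlib import Path


def venv_python(env_dir):
    """Path of the interpreter inside a POSIX virtual environment."""
    return Path(env_dir) / "bin" / "python"


def prefix_seen_by(env_dir):
    """Ask the environment's interpreter which prefix it is running from."""
    result = subprocess.run(
        [str(venv_python(env_dir)), "-c", "import sys; print(sys.prefix)"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "env"
        # with_pip=False keeps the demonstration fast; a real image would install pip.
        venv.create(env_dir, symlinks=True, with_pip=False)
        # No activation happened, yet the interpreter reports the environment as its prefix.
        print(prefix_seen_by(env_dir))
```

In a Dockerfile the same idea would amount to invoking the environment's bin/pip and bin/python directly, so that nothing depends on the shell state of an earlier RUN instruction.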

Docker pass in arguments to python script that uses argparse

I have the following Dockerfile:
FROM ubuntu
RUN apt-get update \
&& apt-get install -y python3 \
&& apt-get install -y python3-pip \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
&& pip3 install boto3
ENV INSTALL_PATH /docker-flowcell-restore
RUN mkdir -p $INSTALL_PATH
WORKDIR $INSTALL_PATH
COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt
COPY /src/* $INSTALL_PATH/src/
ENTRYPOINT python3 src/main.py
The Python script that the ENTRYPOINT points to takes some parameters that I would like to pass in. I used argparse in the script to define them; one example is a --key option. The value of --key will change on each run of the script. How do I pass this argument to my script so that it runs with the correct parameters?
I have tried
docker run my_image_name --key 100
but the argument does not reach the Python script.
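You can use the CMD instruction to pass parameters (and to set default ones for an ENTRYPOINT). Note that CMD values and docker run arguments are only appended to an ENTRYPOINT written in exec (JSON array) form; the shell form used in the question drops them. An exec-form instruction looks like this, for example: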
CMD [ "python", "manage.py", "runserver", "0.0.0.0:8000" ]
Take a look here for details.
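For reference, a minimal sketch of what the argparse side might look like. It stands in for the asker's src/main.py, whose real contents are not shown, so the option type and default are assumptions. If the ENTRYPOINT is written in exec form, for example ENTRYPOINT ["python3", "src/main.py"], then docker run my_image_name --key 100 appends --key 100 to that command and the parser below receives it.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Hypothetical stand-in for src/main.py")
    # Assumption: --key is an integer; the question only shows the value 100.
    parser.add_argument("--key", type=int, default=None, help="value that changes on each run")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"key={args.key}")
```

A CMD such as CMD ["--key", "0"] could then supply a default that any arguments given to docker run replace.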
