I'm trying to build an application with Python to scrape and serve data.
All data is stored as a sqlite3 database in the /app/data folder.
Here's my Dockerfile:
FROM python:3.6.0
WORKDIR /app
COPY requirements.txt .
RUN mkdir -p /app/data /app/logs \
    && chmod -R 777 /app/data /app/logs
RUN pip install -r requirements.txt
COPY . .
ENTRYPOINT ["python", "app.py"]
Azure pulls the image from a private Docker Hub repository.
At first the application worked fine, but after a few hours the image got updated (I didn't change anything) and the container was recreated, which means all my data (database/logs) is gone.
Continuous Deployment is set to Off and I'm not updating the image on Docker Hub.
How can I prevent the container from being rebuilt?
Is Always On turned on in the App Service settings?
Also, the nature of containers makes them ephemeral, so you should never store data that you want to keep inside them. That said, App Service provides an easy way to map a volume to the storage included with your App Service plan. The feature is called Persistent Shared Storage, and it maps the WEBAPP_STORAGE_HOME env variable to the App Service's /home folder.
In the Web App's Application Settings you need to set WEBSITES_ENABLE_APP_SERVICE_STORAGE to true, and inside your container you'll now see a /home folder. That folder points to the persistent storage of your App Service.
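For example, you could set it with the Azure CLI (the resource group and app names below are placeholders):
az webapp config appsettings set \
  --resource-group my-rg \
  --name my-webapp \
  --settings WEBSITES_ENABLE_APP_SERVICE_STORAGE=true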
Using a Docker Compose file you can also define a volume using that env variable:
${WEBAPP_STORAGE_HOME}/LogFiles:/app/logs
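In context, a minimal docker-compose.yml sketch (the image name is a placeholder) that maps both the data and log folders onto the persistent storage could look like this:
version: '3.3'
services:
  app:
    image: myregistry/scraper-app   # placeholder image name
    ports:
      - "80:80"
    volumes:
      - ${WEBAPP_STORAGE_HOME}/data:/app/data
      - ${WEBAPP_STORAGE_HOME}/LogFiles:/app/logs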
Link to the doc
I've got a python package running in a container.
Is it best practice to install it in /opt/myapp within the container?
Should the logs go in /var/opt/myapp?
Should the config files go in /etc/opt/myapp?
Is anyone recommending writing logs and config files to /opt/myapp/var/log and /opt/myapp/config?
I noticed Google Chrome was installed in /opt/google/chrome on my (host) system, but it didn't place any configs in /etc/opt/...
Is it best practice to install it in /opt/myapp within the container?
I place my apps in my container images in /app, so at the beginning of the Dockerfile I have:
WORKDIR /app
Should the logs go in /var/opt/myapp?
In the container world, the best practice is for your application logs to go to stdout/stderr, not to files inside the container, because containers are ephemeral by design and should be treated that way: when a container is stopped and deleted, all of the data on its filesystem is gone. (A minimal Python sketch of stdout logging follows the steps below.)
In a local Docker development environment you can see the logs with docker logs, and you can:
start a container named gettingstarted from the image docker/getting-started:
docker run --name gettingstarted -d -p 80:80 docker/getting-started
redirect docker logs output to a local file on the docker client (your machine from where you run the docker commands):
docker logs -f gettingstarted &> gettingstarted.log &
open http://localhost to generate some logs
read the log file in real time with tail, or with any text viewer program:
tail -f gettingstarted.log
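On the application side, a minimal sketch of stdout logging in Python, assuming the standard logging module (the logger name and message are just examples):
import logging
import sys

# send all log records to stdout so `docker logs` can collect them
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

logging.getLogger("myapp").info("application started")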
Should the config files go in /etc/opt/myapp?
Again, you can put the config files anywhere you want; I like to keep them together with my app, so in the /app directory. But you should not modify the config files once the container is running. What you should do instead is pass the config values to the container as environment variables at startup with the -e flag. For example, to create a MYVAR variable with the value MYVALUE inside the container, start it this way:
docker run --name gettingstarted -d -p 80:80 -e MYVAR='MYVALUE' docker/getting-started
exec into the container to see the variable:
docker exec -it gettingstarted sh
/ # echo $MYVAR
MYVALUE
From here it is the responsibility of your containerized app to understand these variables and translate them into actual application configuration. Most programming languages let you read env vars from code at runtime, but if that is not an option, you can use an entrypoint.sh script that updates the config files with the values supplied through the env vars. A good example of this is the postgresql entrypoint: https://github.com/docker-library/postgres/blob/master/docker-entrypoint.sh
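A minimal sketch of such an entrypoint script (the template file, the placeholder token, and the MYVAR default are assumptions for illustration):
#!/bin/sh
set -e

# fall back to a default if the variable was not supplied at docker run
: "${MYVAR:=MYVALUE}"

# render the real config file from a template by substituting a placeholder token
sed "s|__MYVAR__|${MYVAR}|g" /app/config.template > /app/config.ini

# hand control to the container's main process (the CMD arguments)
exec "$@"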
Is anyone recommending writing logs and config files to
/opt/myapp/var/log and /opt/myapp/config?
As you can see, it is not recommended to write logs into the container's filesystem; rather, use a solution that saves them outside the container if you need them persisted.
If you understand and follow this mindset, especially that containers are ephemeral, it will be much easier for you to transition from local Docker development to production-ready Kubernetes infrastructures.
Docker is Linux, so almost all of your concerns relate to the best operating system in the world: Linux
Installation folder
This will help you:
Where to install programs on Linux?
Where should I put software I compile myself?
and this: Linux File Hierarchy Structure
As a summary, in Linux you could use any folder for your apps, bearing in mind:
Don't use system folders: /bin /usr/bin /boot /proc /lib
Don't use filesystem folders: /media /mnt
Don't use the /tmp folder, because its content is deleted on each restart
As you researched, you could imitate chrome and use /opt
You could create your own folder like /acme if several developers work on the machine, so you can tell them: "No matter the machine or the application, all the custom content of our company will be in /acme". This also helps if you are security-paranoid, because nobody will be able to guess where your application is. Anyway, if the devil has access to your machine, it's just a matter of time until they find everything.
You could use fine-grained permissions to keep the chosen folder safe
Log Folder
Similar to the previous paragraph:
You could store your logs in the standard location: /var/log/acme.log
Or create your own company standard
/acme/log/api.log
/acme/webs/web1/app.log
Config Folder
This is the key for devops.
In traditional, ancient, manual deployments, some folders were used to store app configurations, like:
/etc
$HOME/.acme/settings.json
But in the modern era, and if you are using Docker, you should not manually store your settings inside the container or on the host. The best way to have just one build and deploy it n times (dev, test, staging, uat, prod, etc.) is to use environment variables.
One build, n deploys, and env variable usage are fundamental for devops and cloud applications. Check the famous https://12factor.net/
III. Config: Store config in the environment
V. Build, release, run: Strictly separate build and run stages
This is also a good practice in any language. Check Heroku: Configuration and Config Vars
So your Python app should not read or expect a file in the filesystem to load its configuration. Maybe for dev, but not for test and prod.
Your Python app should read its configuration from env variables:
import os
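# raises a KeyError if DATABASE_PASSWORD is not set; os.environ.get() would allow a default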
print(os.environ['DATABASE_PASSWORD'])
And then inject these values at runtime:
docker run -it -p 8080:80 -e DATABASE_PASSWORD=changeme my_python_app
And on your development machine, export the variable in the same shell before you run your application:
export DATABASE_PASSWORD=changeme
python myapp.py
Config for a lot of apps
The previous approach is an option for a couple of apps. But if you are moving toward microservices and microfrontends, you will have dozens of apps in several languages. In that case, to centralize the configuration you could use:
spring cloud
zookeeper
https://www.vaultproject.io/
https://www.doppler.com/
Or the Configurator (I'm the author)
I'm relatively new to Docker, so I'm trying to understand how to accomplish my task.
Locally, I:
Build the image
Push the image to some URL
SSH into Linux VM
docker pull image from URL
docker run image_name
This image, when run, downloads 2 fairly large csv.gz's. When unzipped, the two CSV's are about 15GB each.
I set up /app on the Linux VM to have 200GB available. So, in short, I need the Docker image to download those two CSVs there. However, no matter what I've tried within my Dockerfile, I see 'No space left on device' when it gets to the part that downloads the CSVs.
I've tried to set WORKDIR to /app, but that does not help.
Do I need to use a daemon.json file? Does some sort of Docker setting need to be changed on the Linux VM? Do I need to look into Docker volumes?
Relevant pieces of Dockerfile:
FROM centos/python-36-centos7
USER root
WORKDIR /usr/src/app
COPY . .
As for /usr/src/app, I've never seen anything in there. I normally use /usr/src/app since that's what I use for my Cloud Foundry deployments.
Any insight to point me in the right direction would be appreciated.
Doing the following resolved the issue:
Create /etc/docker/daemon.json (if it does not already exist) and write:
{
  "data-root": "/app"
}
Looks like by default everything goes under /var (Docker's default data-root is /var/lib/docker), and in my case /var only has 4GB of space. /app is where the 200GB resides.
You will have to restart the Docker service after creating/saving daemon.json.
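On a systemd-based distro that is, for example:
sudo systemctl restart docker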
Referenced this answer: the one with 88 upvotes
I'm trying to host Jenkins in a Docker container in the Azure App Service. This means it's 'linux' hosting.
By default the jenkins/jenkins:2.110-alpine Docker image stores its data in the /var/jenkins_home folder in the container. I want this data/config persisted to Azure persistent storage so that it survives container restarts.
I've read documentation and blogs stating that you can have container data persisted if it's stored in the /home folder.
So I've customized the Jenkins Dockerfile to look like this...
FROM jenkins/jenkins:2.110-alpine
USER root
RUN mkdir /home/jenkins
RUN ln -s /var/jenkins_home /home/jenkins
USER jenkins
However, when I deploy to Azure App Service I don't see the file in my /home folder (looking in Kudu console). The app starts just fine, but I lose all of my data when I restart my container.
What am I missing?
That's expected, because you only persist a symlink (ln -s /var/jenkins_home /home/jenkins) on the Azure host; all the files physically exist inside the container.
To fix this, you have to actually change the Jenkins configuration to store all data in /home/jenkins, which you have already created in your Dockerfile above.
A quick search for Jenkins data folder suggests that you set the environment variable JENKINS_HOME to your directory.
In your Dockerfile:
ENV JENKINS_HOME /home/jenkins
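Putting it together, a sketch of the adjusted Dockerfile (the chown line is an assumption so that the jenkins user can write to its new home):
FROM jenkins/jenkins:2.110-alpine
USER root
# create the data directory on the persisted /home share instead of symlinking
RUN mkdir -p /home/jenkins && chown -R jenkins:jenkins /home/jenkins
USER jenkins
# tell Jenkins to keep all of its state under the persistent folder
ENV JENKINS_HOME /home/jenkins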
When I create a web app on Azure and need to deploy my code, I can just put files into the /site/wwwroot folder.
But when I try to build a custom Docker image, I don't understand how to get the same behavior. In the official image - https://github.com/Azure-App-Service/php/blob/master/7.0.6-apache/Dockerfile - I see this code:
ln -s /home/site/wwwroot /var/www/html
but it does not work for me (I don't understand why).
My Dockerfile looks like this:
FROM php:7.0.6-apache
RUN ln -s /home/site/wwwroot /var/www/html
CMD ["apache2-foreground"]
In your app settings, look for this setting: WEBSITES_ENABLE_APP_SERVICE_STORAGE. Make sure it is set to true. By default, storage mounting is turned off for "Web App for Containers" (instances where you bring your own container); this is to prevent app restarts for things like storage failover if you're not using the storage. If you include this setting and set it to true, it'll mount just like the official images do. We're working on making this more transparent and settable in a way other than through an app setting.
(Screenshot: the app setting in the Azure portal)
I have two fairly simple Docker containers: one containing a NodeJS application, the other just a MongoDB container.
Dockerfile.nodeJS
FROM node:boron
ENV NODE_ENV production
# Create app directory
RUN mkdir -p /node/api-server
WORKDIR /node/api-server
# Install app dependencies
COPY /app-dir/package.json /node/api-server/
RUN npm install
# Bundle app source
COPY /app-dir /node/api-server
EXPOSE 3000
CMD [ "node", "." ]
Dockerfile.mongodb
FROM mongo:3.4.4
# Create database storage directory
VOLUME ["/data/db"]
# Define working directory.
WORKDIR /data
# Define default command.
CMD ["mongod"]
EXPOSE 27017
They both work independently of each other, but when I create two separate containers from them, they won't communicate with each other anymore (why?). Online there are a lot of tutorials about doing this with or without docker-compose, but they all use --link, which is a deprecated legacy feature of Docker, so I don't want to go down that path. What is the way to go in 2017 to make this connection between two Docker containers?
you can create a specific network (the default bridge driver is enough for containers on a single host; the overlay driver would require swarm mode)
docker network create boron_mongo
and then you launch both containers on that network:
docker run --network=boron_mongo...
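Spelled out as a minimal sketch (the image names are placeholders), it could look like this; on a user-defined network, containers can reach each other by container name:
docker network create boron_mongo
docker run -d --name mongodb --network=boron_mongo my-mongo-image
docker run -d --name api-server --network=boron_mongo -p 3000:3000 my-node-image
# the Node app can now reach MongoDB at mongodb://mongodb:27017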
extract from
https://docs.docker.com/compose/networking/
The preferred way is to use docker-compose
Have a look at
Configuring the default network
https://docs.docker.com/compose/networking/#specifying-custom-networks
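For this particular setup, a minimal docker-compose.yml sketch (the service names are assumptions) could look like the following; Compose puts both services on a shared default network, so the Node app can reach MongoDB at the hostname mongo:
version: '3'
services:
  api:
    build:
      context: .
      dockerfile: Dockerfile.nodeJS
    ports:
      - "3000:3000"
    depends_on:
      - mongo
  mongo:
    build:
      context: .
      dockerfile: Dockerfile.mongodb
    volumes:
      - mongo-data:/data/db

volumes:
  mongo-data: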