Openshift creating too many processes - python-3.x

I have a Python application running under gunicorn. I have wrapped it in a Docker image and deployed it on OpenShift. However, the pod either consumes too much memory or crashes with an OOM (out of memory) error.
On investigating, I found that multiple instances of my app are being created even though I haven't told gunicorn to create multiple workers.
Note: when the same Docker image is run on my local machine, it works perfectly fine.

Whose image are you using? If you are using the Python S2I image provided by OpenShift to wrap your application, and you haven't taken control of how the WSGI server is started but are letting the OpenShift image configure it, the image will set the number of processes based on the resources it detects. If your web application is particularly memory hungry and uses more than a typical application, the number of processes it creates may be too many. In that case you can set the WEB_CONCURRENCY environment variable to override how many processes it starts.
See WEB_CONCURRENCY in:
https://github.com/sclorg/s2i-python-container/blob/master/3.6/README.md
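
As a rough illustration of that override, here is a minimal sketch of a gunicorn configuration file that derives the worker count from WEB_CONCURRENCY. The helper name resolve_workers and the fallback of one worker are assumptions made for this example, not what the S2I image does internally. It relies only on the fact that gunicorn reads module-level settings such as workers from a Python config file passed with -c.

```python
import os


def resolve_workers(environ, fallback=1):
    """Return the worker count from WEB_CONCURRENCY, or `fallback` (an assumed default)."""
    raw = environ.get("WEB_CONCURRENCY", "").strip()
    if not raw:
        return fallback
    try:
        count = int(raw)
    except ValueError:
        # Unparseable value: fall back rather than fail at startup.
        return fallback
    # Never return fewer than one worker.
    return max(count, 1)


# gunicorn picks up the module-level `workers` setting from a config file.
workers = resolve_workers(os.environ)
```

With the variable set on the deployment (for example, oc set env dc/myapp WEB_CONCURRENCY=2, where myapp is a placeholder name), a memory-hungry app would start two workers instead of a count derived from the detected resources.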

Related

Schedule daily Docker container restart / reset

I have a Linux based Docker container running an application which seems to have a memory leak. After around a week requests to the application start to fail and the container requires a restart to reset its state and get things working again.
The error reported by the application is:
java.lang.OutOfMemoryError: Java heap space
Is there a generic method that can be used to trigger a restart, resetting its state, regardless of which service is being used to host it? If there's no good generic solution, I'm about to give DigitalOcean a whirl, so maybe there's a DigitalOcean-specific solution that would work instead?
You can set a restart policy (the --restart flag with the value on-failure) as described here.
Check out the Watchtower project. It is a tool that watches running Docker containers and automatically updates them when a new image is published, restarting them in the process, and its checks can be run on a schedule.
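
If a plain scheduled restart is all that is needed, it can also be scripted independently of the hosting service. The following is only a sketch under stated assumptions: it runs on the Docker host, the docker CLI is on the PATH, and the container name passed in is whatever yours is called. The function names and the 03:00 default are made up for this example. A cron entry that calls docker restart would achieve the same thing.

```python
import datetime
import subprocess
import time


def seconds_until(hour, minute, now):
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


def restart_daily(container, hour=3, minute=0):
    """Sleep until the chosen time each day, then run `docker restart`."""
    while True:
        time.sleep(seconds_until(hour, minute, datetime.datetime.now()))
        # check=False: a failed restart is tried again the next day
        # instead of ending the loop with an exception.
        subprocess.run(["docker", "restart", container], check=False)
```

Note that this only hides the leak: the heap still fills up between restarts, so the interval has to be shorter than the time the application takes to run out of memory.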

running nodejs app inside go

I have a requirement. Is there a way to run nodejs apps inside golang? I need to wrap the nodejs app inside a golang application so that the end result is a golang binary that starts the nodejs server and is then able to call the nodejs REST endpoints. I need to encapsulate the entire nodejs application in the golang binary, including node_modules and, if necessary, the nodejs runtime.
Well, you could make a Go program that includes e.g. a zipped Node application that it extracts and starts, but it will be very hard to do well: you will have huge binaries, delays in extracting files, potential portability problems etc. Usually when you want to call REST endpoints you host your Node app on some server and let the client app (the Go app in your example) connect to that Node app. The advantages are that it is much faster, the app is much smaller, you don't have portability issues with Node binaries and addons, and you can quickly update your backend any time you want.
It would be a very bad idea to embed a nodejs app into your golang binary, for various reasons such as binary size, the difficulty of pushing security updates, etc.
However, if you feel so strongly that they should be together, you could easily create a docker container with these two (a golang server + a node app) and launch them via docker. You can set the entrypoint to a supervisord daemon so that your node server as well as the golang server are brought up when your container is run.
If you are planning to deploy via kubernetes, you can also create two individual docker containers (one for the golang server, one for the node server) and always deploy them together as a pod.
There are multiple projects to embed binary files and/or file system data into your Go application.
Look at the 'Alternatives' section of the 'vfsgen' project:
https://github.com/shurcooL/vfsgen#alternatives

Do I first need docker environment before starting my project?

I am going to work with Node.js and PostgreSQL on Linux. I have spent many hours reading about how docker actually works. Still, I am not sure whether I need a docker environment before starting my project, or whether I can introduce docker after the project is complete.
Let's first understand what docker is and how you can use it in your project.
Docker has three core concepts:
1) Docker engine : a lightweight runtime and robust tooling that builds and runs your Docker containers.
2) Docker image : a carbon copy of your project environment including all environment dependencies like base operating system, host entries, environment variables, databases, web/application servers. In your case, the Linux distribution of your choice, node.js and the required modules, PostgreSQL and its configuration.
3) Docker container : can be visualized as a virtual Linux server running your project. Each time you use docker run, a new container is launched from the docker image.
You can visualize a docker environment as a lightweight virtual machine where you can run your project without any external interference (host entries / environment variables / RAM / CPU) from other projects.
So as a developer, you can develop your project on your dev machine, and once it's ready to be pushed to QA/Staging you can build a docker image of your project, which can then be deployed on any environment (QA/Staging/Production).
You can launch multiple containers from your image on a single physical server or on several.
You can introduce Docker whenever you want. If using multiple servers then you can create a Docker container with one server in it and the other (non-Dockerised solution) makes requests to that.
Or you could Dockerise them both.
Basically, introduce Docker when you feel the time is right.
I like to divide a large project into multiple sections - e.g. front end web server, backend authentication server, backend API server 1, backend API server 2, etc.
As each part of the project gets completed, I Dockerise it. The other parts then use the Dockerised solution.

Docker container management solution

We have NodeJS applications running inside docker containers. Sometimes a process gets locked up, or the app goes down because of some other issue, and we have to manually log in to each container and restart the application. I was wondering if there is any sort of control panel that would allow us to easily and quickly restart those and see the overall health of the system.
Please note: we can't use the --restart flag because the application doesn't actually exit with an exit code. It runs into problems such as a process getting blocked; things just get bogged down rather than crashing with an exit code. That's why I don't think a restart policy will help in this scenario.
I suggest you consider using the new HEALTHCHECK directive in Docker 1.12 to define a custom check for your locking condition. This feature can be combined with the new Docker swarm service feature to specify how many copies of your container you want to have running.
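
A minimal sketch of such a check follows. Docker treats a health check command that exits with 0 as healthy and with 1 as unhealthy, so the function returns those values. It is written in Python only to keep the example short; the same logic could be written in Node so that no extra runtime is needed in the image. The probe name, the injectable opener parameter, and the idea that the app exposes an HTTP endpoint that stops answering when it locks up are all assumptions of this example.

```python
import urllib.request


def probe(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return 0 if the app answers with a 2xx status within `timeout`, else 1."""
    try:
        with opener(url, timeout=timeout) as response:
            return 0 if 200 <= response.status < 300 else 1
    except Exception:
        # Timeouts, refused connections and HTTP errors all count as unhealthy.
        return 1
```

A small wrapper script would finish with sys.exit(probe("http://localhost:3000/health")), where the port and path are placeholders, and the Dockerfile would point HEALTHCHECK at that script with CMD. The timeout matters here: a blocked process typically accepts the connection but never answers, which is exactly the case a restart policy cannot see.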

docker and product versions

I am working for a product company and we make a lot of releases of the product. In our current approach to testing multiple releases, we create a separate VM for each and install all the infrastructure software (db, app server etc.) on top of it. Later we deploy the application WARs on the respective VM. Recently I came across docker, and it seems like it could be very helpful, so I started exploring it with the examples listed on the site. But I am not able to work out how docker can be applied to build environments suitable for various releases.
Each product version will have db schema changes.
Each application WARs will have enhancements/defects etc.
Consider below example.
Every month our company releases a new version of the software, and in order to support it and fix defects we create one VM per release. If the application's overall size is 2 GB and the OS takes close to 5 GB, each VM needs about 7 GB of disk (apart from space, it also takes up system resources as extra overhead). The VMs are required to restore any release and test any support issues reported against it. But looking at the additional infrastructure requirements, it seems to be a very costly affair.
Can docker have everything required to run an application inside a container/image?
Can docker pack an application which consists of multiple WARs/DB schemas and when started allocate appropriate port?
Will there be any space/memory/speed differences compared to VM and docker assuming above scenario?
Do you think docker is still an appropriate solution, or should we continue using VMs? Can someone share pointers on how I can achieve the above requirements with docker?
tl;dr: Yes, docker can run most applications inside a container.
Docker runs a single process inside each container. When using VMs or real servers, this one process is usually the init system which starts all system services. With docker it is usually your app.
This difference will get you faster startup times for your app (not starting the whole operating system). The trade off is that, if you depend on system services (such as cron, sshd…) you will need to start them yourself. There are some base images that provide a more "VM-like" environment… check phusion's baseimage for instance. To start more than a single process, you can also use a process manager such as supervisord.
Going forward, the recommended (although not required) approach is to start one process in each container (one per application server, one per database server, and so on) and not use containers as VMs.
Docker has no problems allocating ports either. The Dockerfile even has an explicit instruction for it: EXPOSE. Exposed ports can also be published on the docker host with the --publish argument of docker run, so you don't even need to know the IP assigned to the container.
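
To make the port mapping concrete, here is a small sketch that builds one docker run command per release, each published on its own host port. The image names, the product- container name prefix, the container port 8080 and the starting host port 8081 are all hypothetical values chosen for the example; only the docker run flags themselves (--detach, --name, --publish) are real.

```python
def docker_run_args(release, image, host_port, container_port=8080):
    """Build a `docker run` command that publishes one release on its own host port."""
    return [
        "docker", "run", "--detach",
        "--name", f"product-{release}",
        "--publish", f"{host_port}:{container_port}",
        image,
    ]


def commands_for(releases, first_host_port=8081):
    """Assign consecutive host ports to (release, image) pairs, in the order given."""
    return [
        docker_run_args(release, image, first_host_port + offset)
        for offset, (release, image) in enumerate(releases)
    ]
```

For example, commands_for([("1.0", "product:1.0"), ("1.1", "product:1.1")]) yields two commands that publish the releases on host ports 8081 and 8082, so several releases can run side by side on one host for support testing.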
Regarding used space, you will probably see important savings. Docker images are created by stacking filesystem layers… this means that the common layers are only stored once on the server. In your setup, you will likely only have one copy of the base operating system layer (with VMs, you have a copy on each VM).
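
A back-of-the-envelope calculation shows the effect, using the 5 GB operating system and 2 GB application figures from the question. It is a deliberately simplified sketch: it assumes every release is built on exactly the same base layer and that the application layers of different releases share nothing, so real numbers will differ.

```python
def vm_disk_gb(releases, os_gb=5, app_gb=2):
    """Each VM carries its own copy of the OS plus the application."""
    return releases * (os_gb + app_gb)


def docker_disk_gb(releases, os_gb=5, app_gb=2):
    """The shared base layer is stored once; only application layers repeat."""
    return os_gb + releases * app_gb


# One year of monthly releases:
print(vm_disk_gb(12))      # 84
print(docker_disk_gb(12))  # 29
```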
On memory you will probably see less significant savings (mostly from not starting all the operating system services). Speed is still a subject of research… A few things that are clear so far are that for faster IO you will need to use docker volumes, and that for network-heavy use cases you should use host networking. Check the IBM research report "An Updated Performance Comparison of Virtual Machines and Linux Containers" for details, or a summary like InfoQ's.
