I am very new to web development and currently trying to deploy a Flask application on Microsoft Azure. The deployment is successful but the website loads for a very long time and then crashes...
The errors I get in the Application logs are as follows:
2020-04-24 11:05:10.813 INFO - Logging is not enabled for this container.
Please use https://aka.ms/linux-diagnostics to enable logging to see container logs here.
2020-04-24 11:05:14.160 INFO - Initiating warmup request to container <container name> for site <appname>
2020-04-24 11:05:29.738 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 15.5776184 sec
2020-04-24 11:05:46.605 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 32.4451731 sec
2020-04-24 11:06:02.843 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 48.6833158 sec
2020-04-24 11:06:20.077 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 65.9168483 sec
2020-04-24 11:06:35.755 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 81.5946978 sec
2020-04-24 11:06:51.273 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 97.1126494 sec
2020-04-24 11:07:06.740 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 112.5797427 sec
2020-04-24 11:07:22.295 INFO - Waiting for response to warmup request for container <container name>. Elapsed time = 128.1349342 sec
2020-04-24 11:09:28.785 ERROR - Container <container name> for site <appname> did not start within expected time limit. Elapsed time = 254.6249399 sec
2020-04-24 11:09:28.833 ERROR - Container <container name> didn't respond to HTTP pings on port: 8000, failing site start. See container logs for debugging.
2020-04-24 11:09:29.008 INFO - Stoping site <appname> because it failed during startup.
What am I missing?
In my main.py file I set app.run(debug=True); so no ports and hosts specified, is this ok?
In the Cloud Shell I set:
az webapp config appsettings set --resource-group <resource-group-name> --name <app-name> --settings WEBSITES_PORT=8000
Does anyone know where the problem is?
All the best,
snowe
Hosting your webapp on Azure App Service (Web App) you can only work with port 80 and 443. If you need custom port, you'll need to use Azure Virtual Machine to host your application
Related
I have a Web App running in Azure inside a docker container. However, I am unable to access the application. Here is the log from "Deployment Center":
docker run -d -p 8349:80 --name app-my-web-app_0_bca7f103 -e PORT=80 -e WEBSITES_ENABLE_APP_SERVICE_STORAGE=false -e WEBSITES_PORT=80 -e WEBSITE_SITE_NAME=app-my-web-app -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=app-my-web-app.azurewebsites.net -e WEBSITE_INSTANCE_ID=0a7710206f1bf4cc0f990b30fe5398a5e19747081c3fc5a2f68d99d5f89c7df1 mywebapp.azurecr.io/mywebapp:latest
2021-09-23T13:19:11.654Z INFO - Logging is not enabled for this container.
Please use https://aka.ms/linux-diagnostics to enable logging to see container logs here.
2021-09-23T13:19:14.954Z INFO - Initiating warmup request to container app-my-web-app_0_bca7f103 for site app-my-web-app
2021-09-23T13:19:30.307Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 15.3528308 sec
2021-09-23T13:19:45.558Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 30.6041298 sec
2021-09-23T13:20:00.710Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 45.756227 sec
2021-09-23T13:20:16.272Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 61.3183851 sec
2021-09-23T13:20:31.463Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 76.508941 sec
2021-09-23T13:20:46.994Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 92.0402202 sec
2021-09-23T13:21:12.911Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 117.9568867 sec
2021-09-23T13:21:28.380Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 133.4261021 sec
2021-09-23T13:21:43.605Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 148.6510226 sec
2021-09-23T13:21:59.418Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 164.4641901 sec
2021-09-23T13:22:14.994Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 180.0399709 sec
2021-09-23T13:22:30.235Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 195.2811287 sec
2021-09-23T13:22:51.871Z INFO - Waiting for response to warmup request for container app-my-web-app_0_bca7f103. Elapsed time = 216.9167296 sec
2021-09-23T13:23:05.306Z ERROR - Container app-my-web-app_0_bca7f103 for site app-my-web-app did not start within expected time limit. Elapsed time = 230.3521527 sec
2021-09-23T13:23:05.392Z ERROR - Container app-my-web-app_0_bca7f103 didn't respond to HTTP pings on port: 80, failing site start. See container logs for debugging.
2021-09-23T13:23:05.442Z INFO - Stopping site app-my-web-app because it failed during startup.
My Dockerfile looks like this
FROM python:3.7-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 80
RUN mkdir ~/.streamlit
WORKDIR /app
ENTRYPOINT ["streamlit", "run"]
CMD ["my_web_app.py"]
Furthermore, I added the following "Application Settings" under configuration:
Name=PORT, Value=80
Name=WEBSITES_PORT, Value=80
I also tried to increase WEBSITES_CONTAINER_START_TIME_LIMIT to 1800. Nothing works.
Any ideas to how this can be resolved? I have confirmed that the docker container builds and runs locally.
I have been struggling for some days now to deploy an Azure App Service with two minimalist Docker containers.
My first image is a basic PostgreSQL image and my second image is built with the following Dockerfile:
FROM python:3.7
RUN pip install streamlit
COPY app.py /streamlit-docker/
WORKDIR /streamlit-docker/
CMD streamlit run app.py
and where app.py is:
import streamlit as st
st.write('Hello world')
The docker-compose.yml file I use for creating my Azure App service is the following:
version: '3.3'
services:
db:
image: atestcr.azurecr.io/postgres:latest
environment:
POSTGRES_PASSWORD: postgres
app:
image: atestcr.azurecr.io/my_app:latest
ports:
- '8501:8501'
I get an 'Application Error' whenever I try to access the webapp URL. However, when I deploy this app with a single docker container, using this docker-compose.yml:
version: '3.3'
services:
app:
image: atestcr.azurecr.io/my_app:latest
ports:
- '8501:8501'
everything works fine.
Here is a GitHub repo to recreate the bug.
These are the Azure logs:
2021-08-09T17:29:34.382Z INFO - Starting multi-container app..
2021-08-09T17:29:35.089Z INFO - Pulling image: atestcr.azurecr.io/postgres:latest
2021-08-09T17:29:35.313Z INFO - latest Pulling from postgres
2021-08-09T17:29:35.315Z INFO - Digest: sha256:b6df1345afa5990ea32866e5c331eefbf2e30a05f2a715c3a9691a6cb18fa253
2021-08-09T17:29:35.317Z INFO - Status: Image is up to date for atestcr.azurecr.io/postgres:latest
2021-08-09T17:29:35.319Z INFO - Pull Image successful, Time taken: 0 Minutes and 0 Seconds
2021-08-09T17:29:35.335Z INFO - Starting container for site
2021-08-09T17:29:35.335Z INFO - docker run -d -p 8477:5432 --name testapp32_db_0_6662435d -e WEBSITES_ENABLE_APP_SERVICE_STORAGE=false -e WEBSITE_SITE_NAME=testapp32 -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=testapp32.azurewebsites.net -e WEBSITE_INSTANCE_ID=42c09ff46e6c54cb467d28e88f2ab5b1e8971ee4daf2e883f44401bde67fe89f -e HTTP_LOGGING_ENABLED=1 atestcr.azurecr.io/postgres:latest
2021-08-09T17:29:35.781Z INFO - Pulling image: atestcr.azurecr.io/my_app:latest
2021-08-09T17:29:35.984Z INFO - latest Pulling from my_app
2021-08-09T17:29:35.987Z INFO - Digest: sha256:659937a52a6223b938b3d429901ab8648497870bf8068b5dcc05816050db5eaf
2021-08-09T17:29:35.988Z INFO - Status: Image is up to date for atestcr.azurecr.io/my_app:latest
2021-08-09T17:29:35.993Z INFO - Pull Image successful, Time taken: 0 Minutes and 0 Seconds
2021-08-09T17:29:36.014Z INFO - Starting container for site
2021-08-09T17:29:36.014Z INFO - docker run -d -p 0:8501 --name testapp32_app_0_6662435d -e WEBSITES_ENABLE_APP_SERVICE_STORAGE=false -e WEBSITE_SITE_NAME=testapp32 -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=testapp32.azurewebsites.net -e WEBSITE_INSTANCE_ID=42c09ff46e6c54cb467d28e88f2ab5b1e8971ee4daf2e883f44401bde67fe89f -e HTTP_LOGGING_ENABLED=1 atestcr.azurecr.io/my_app:latest
2021-08-09T17:33:26.741Z ERROR - multi-container unit was not started successfully
2021-08-09T17:33:26.846Z INFO - Container logs from testapp32_db_0_6662435d = 2021-08-09T17:29:40.459721366Z The files belonging to this database system will be owned by user "postgres".
2021-08-09T17:29:40.491740899Z This user must also own the server process.
2021-08-09T17:29:40.493019808Z
2021-08-09T17:29:40.497263739Z The database cluster will be initialized with locale "en_US.utf8".
2021-08-09T17:29:40.502456876Z The default database encoding has accordingly been set to "UTF8".
2021-08-09T17:29:40.503203182Z The default text search configuration will be set to "english".
2021-08-09T17:29:40.503218482Z
2021-08-09T17:29:40.503223282Z Data page checksums are disabled.
2021-08-09T17:29:40.506809008Z
2021-08-09T17:29:40.521275113Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2021-08-09T17:29:40.523912632Z creating subdirectories ... ok
2021-08-09T17:29:40.525237242Z selecting dynamic shared memory implementation ... posix
2021-08-09T17:29:40.642905697Z selecting default max_connections ... 100
2021-08-09T17:29:40.733900958Z selecting default shared_buffers ... 128MB
2021-08-09T17:29:41.039050775Z selecting default time zone ... Etc/UTC
2021-08-09T17:29:41.047757638Z creating configuration files ... ok
2021-08-09T17:29:44.416903507Z running bootstrap script ... ok
2021-08-09T17:29:50.023628737Z performing post-bootstrap initialization ... ok
2021-08-09T17:30:04.217544961Z syncing data to disk ... ok
2021-08-09T17:30:04.218321267Z
2021-08-09T17:30:04.219189873Z initdb: warning: enabling "trust" authentication for local connections
2021-08-09T17:30:04.219206473Z You can change this by editing pg_hba.conf or using the option -A, or
2021-08-09T17:30:04.219223973Z --auth-local and --auth-host, the next time you run initdb.
2021-08-09T17:30:04.219491175Z
2021-08-09T17:30:04.219513275Z Success. You can now start the database server using:
2021-08-09T17:30:04.219519575Z
2021-08-09T17:30:04.219523675Z pg_ctl -D /var/lib/postgresql/data -l logfile start
2021-08-09T17:30:04.219527775Z
2021-08-09T17:30:08.340584036Z waiting for server to start.......2021-08-09 17:30:08.340 UTC [44] LOG: starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2021-08-09T17:30:08.478247580Z 2021-08-09 17:30:08.410 UTC [44] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-08-09T17:30:08.680599067Z 2021-08-09 17:30:08.680 UTC [45] LOG: database system was shut down at 2021-08-09 17:29:49 UTC
2021-08-09T17:30:08.753589768Z 2021-08-09 17:30:08.753 UTC [44] LOG: database system is ready to accept connections
2021-08-09T17:30:08.755965684Z done
2021-08-09T17:30:08.760382514Z server started
2021-08-09T17:30:09.790821978Z
2021-08-09T17:30:09.799723839Z /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
2021-08-09T17:30:09.802079455Z
2021-08-09T17:30:09.840304817Z 2021-08-09 17:30:09.834 UTC [44] LOG: received fast shutdown request
2021-08-09T17:30:09.862102366Z waiting for server to shut down....2021-08-09 17:30:09.861 UTC [44] LOG: aborting any active transactions
2021-08-09T17:30:09.950488072Z 2021-08-09 17:30:09.950 UTC [44] LOG: background worker "logical replication launcher" (PID 51) exited with exit code 1
2021-08-09T17:30:09.954360498Z 2021-08-09 17:30:09.950 UTC [46] LOG: shutting down
2021-08-09T17:30:13.063848848Z ...2021-08-09 17:30:13.063 UTC [44] LOG: database system is shut down
2021-08-09T17:30:13.138558463Z done
2021-08-09T17:30:13.140082774Z server stopped
2021-08-09T17:30:13.144102302Z
2021-08-09T17:30:13.162430928Z PostgreSQL init process complete; ready for start up.
2021-08-09T17:30:13.165654351Z
2021-08-09T17:30:14.083504086Z 2021-08-09 17:30:13.992 UTC [1] LOG: starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
2021-08-09T17:30:14.084760295Z 2021-08-09 17:30:14.011 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2021-08-09T17:30:14.084774095Z 2021-08-09 17:30:14.015 UTC [1] LOG: listening on IPv6 address "::", port 5432
2021-08-09T17:30:14.084779495Z 2021-08-09 17:30:14.082 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2021-08-09T17:30:14.156076187Z 2021-08-09 17:30:14.132 UTC [63] LOG: database system was shut down at 2021-08-09 17:30:12 UTC
2021-08-09T17:30:14.198749982Z 2021-08-09 17:30:14.176 UTC [1] LOG: database system is ready to accept connections
2021-08-09T17:30:15.006882460Z 2021-08-09 17:30:14.998 UTC [70] LOG: invalid length of startup packet
2021-08-09T17:30:16.112919592Z 2021-08-09 17:30:16.112 UTC [71] LOG: invalid length of startup packet
2021-08-09T17:30:17.178350344Z 2021-08-09 17:30:17.174 UTC [72] LOG: invalid length of startup packet
...
2021-08-09T17:33:25.735293291Z 2021-08-09 17:33:25.735 UTC [260] LOG: invalid length of startup packet
2021-08-09T17:33:29.142Z INFO - Container logs from testapp32_app_0_6662435d = 2021-08-09T17:30:24.010212694Z
2021-08-09T17:30:24.011085300Z You can now view your Streamlit app in your browser.
2021-08-09T17:30:24.011101800Z
2021-08-09T17:30:24.012092107Z Network URL: http://172.16.232.3:8501
2021-08-09T17:30:24.012855612Z External URL: http://20.40.148.207:8501
2021-08-09T17:30:24.013407416Z
2021-08-09T17:33:36.002Z INFO - Stopping site testapp32 because it failed during startup.
It's worth noting your shared example doesn't work because your docker-compose.yml is using your own private container registries.
However, tinkering around the edges I have managed to get your dummy example working with changes to the following files:
Your app dockerfile:
FROM python:3.7
RUN pip install streamlit
COPY app.py /streamlit-docker/
WORKDIR /streamlit-docker/
EXPOSE 80
CMD streamlit run app.py --server.port 80
And adjusting your docker-compose.yml
version: '3.3'
services:
db:
image: postgis/postgis:13-master
environment:
- POSTGRES_DB=counterfactualcovid
- POSTGRES_USER=django
- POSTGRES_PASSWORD=django
app:
image: testazurecontainerregistryajc.azurecr.io/mcmegaapp:latest
ports:
- '80:80'
Ignore the fact i've used a generic postgres docker image and my own private container registry for the images and you'll see the key changes are:
in the Dockerfile exposing port 80, and adding --server.port 80 to the streamlit CMD call
in the docker-compose.yml setting the port forwarding to 80:80
I /think/ your issues arise from the preview limitations for the multi-container web apps feature, so setting everything to work via port 80 appears to resolve this.
We are running node server on GAE and for some reason a few times a day our server is offline (sometimes it can take a few mins to come back online).
Requests are the same throughout the day and there is also no exception that would be the cause of restart. There is no spike in requests or any special requests that could cause it.
Log when it happens:
2020-04-18T23:48:51.881806Z [GET /v1/util/example [36m304 [35.262 ms - -[ A
2020-04-18T23:50:17.119906Z [start] 2020/04/18 23:50:17.119185 Quitting on terminated signal A
2020-04-18T23:50:17.175632Z [start] 2020/04/18 23:50:17.175267 Start program failed: user application failed with exit code -1 (refer to stdout/stderr logs for more detail): signal: terminated
2020-04-18T23:51:38.772388Z GET 304 173 B 3.3 s Example-V2/3.1.13 (com.example.app; build:1; iOS 13.4.0) Alamofire/5.1.0 /v1/util/example GET 304 173 B 3.3 s Example-V2/3.1.13 (com.example.app; build:1; iOS 13.4.0) Alamofire/5.1.0 5e9b928a00ff0bc9244f94194c0001737e737065616b2d76322d32613166310001737065616b2d6170693a323032303034303374303630343431000100
2020-04-18T23:51:38.786760Z GET 404 324 B 2.4 s Unknown /_ah/start GET 404 324 B 2.4 s Unknown 5e9b928a00ff0c014898f5c27f0001737e737065616b2d76322d32613166310001737065616b2d6170693a323032303034303374303630343431000100
2020-04-18T23:51:39.529080Z [start] 2020/04/18 23:51:39.511828 No entrypoint specified, using default entrypoint: /serve
2020-04-18T23:51:39.529642Z [start] 2020/04/18 23:51:39.528742 Starting app
2020-04-18T23:51:39.529968Z [start] 2020/04/18 23:51:39.529100 Executing: /bin/sh -c exec /serve
2020-04-18T23:51:39.590085Z [start] 2020/04/18 23:51:39.589751 Waiting for network connection open. Subject:"app/invalid" Address:127.0.0.1:8080
2020-04-18T23:51:39.590571Z [start] 2020/04/18 23:51:39.590347 Waiting for network connection open. Subject:"app/valid" Address:127.0.0.1:8081
2020-04-18T23:51:39.764383Z [serve] 2020/04/18 23:51:39.763656 Serve started.
2020-04-18T23:51:39.764935Z [serve] 2020/04/18 23:51:39.764544 Args: {runtimeName:nodejs10 memoryMB:1024 positional:[]}
2020-04-18T23:51:39.766562Z [serve] 2020/04/18 23:51:39.765904 Running /bin/sh -c exec node server.js
2020-04-18T23:51:41.072621Z [start] 2020/04/18 23:51:41.071895 Wait successful. Subject:"app/valid" Address:127.0.0.1:8081 Attempts:296 Elapsed:1.481194491s
2020-04-18T23:51:41.072978Z Express server started on port: 8081
2020-04-18T23:51:41.073008Z [start] 2020/04/18 23:51:41.072411 Starting nginx
2020-04-18T23:51:41.085901Z [start] 2020/04/18 23:51:41.085451 Waiting for network connection open. Subject:"nginx" Address:127.0.0.1:8080
2020-04-18T23:51:41.132064Z [start] 2020/04/18 23:51:41.131572 Wait successful. Subject:"nginx" Address:127.0.0.1:8080 Attempts:9 Elapsed:45.911234ms
2020-04-18T23:51:41.170786Z [GET /_ah/start [33m404 [11.865 ms - 61[
There is always more than 70% memory free, so that could not be the issue. Only noticed very high CPU utilization when it restarts occurs (10x higher than normally).
In the bottom picture you can clearly see when the restarts happen:
This is my app.yaml
runtime: nodejs10
instance_class: B4
service: example-api
basic_scaling:
max_instances: 1
idle_timeout: 30m
handlers:
- url: .*
secure: always
script: auto
This is happening on our production server, so any help would be more than welcome.
Thanks!
Reading this document, it is mentioned that even though they try to keep basic and manual scaling instances running indefinitely, they are sometimes restarted for maintenance or they might fail due to some other reasons. That is why keeping your max instances as 1 is not considered best practice as it is prone to all of these failures. As mentioned in the other answer, I would also recommend to increase the number of instances so the likelyhood of more failing or being restarted at the same time is lower.
We had the same problem when we migrated our Ruby on Rails app to Google App Engine Standard a year ago. After emailing back and forth with Google Cloud Support, they suggested: "increasing the minimum number of instances will help because you will have more “backup” instances."
At the time we had two instances, and since we upped it the three instances, we have had no downtime related to unexpected server restarts.
We are still not sure why our servers are sometimes deemed unhealthy and restarted by App Engine, but having more instances can help you to avoid downtime in the short run while you investigate the underlying issue.
Using Azure Web Apps for Containers to spin up a docker-compose configuration. The primary proxy is Traefik which needs access to the docker API. The container start up correctly. But the traefik container can not access the docker API socket file.
The docker-compose settings for the volume of the traefik container is:
services:
traefik:
image: "traefik:v2.0"
[...]
volumes:
- "/var/run/docker.sock:/var/run/docker.sock:ro"
[...]
The containers start up successfully with the following log:
2019-09-16 10:40:35.040 INFO - Pulling image from Docker hub:
library/traefik:v2.0 2019-09-16 10:40:35.589 INFO - v2.0 Pulling from
library/traefik 2019-09-16 10:40:35.810 INFO - Digest:
sha256:97c6da99b265de1c50e866ff66f927abb84659dcb7916c33e17623fc6969551c
2019-09-16 10:40:35.816 INFO - Status: Image is up to date for
traefik:v2.0 2019-09-16 10:40:35.835 INFO - Pull Image successful,
Time taken: 0 Minutes and 0 Seconds 2019-09-16 10:40:35.871 INFO -
Starting container for site 2019-09-16 10:40:35.872 INFO - docker run
-d -p 40317:80 --name containertest3_traefik_1_6d1c6629 -e WEBSITES_ENABLE_APP_SERVICE_STORAGE=false -e
WEBSITE_SITE_NAME=containertest3 -e WEBSITE_AUTH_ENABLED=False -e
WEBSITE_ROLE_INSTANCE_ID=0 -e
WEBSITE_HOSTNAME=containertest3.azurewebsites.net -e
WEBSITE_INSTANCE_ID=902eae0c51eb407ec9308de2a1c3fce2b35f53f6d148e328560acba2560930f0
-e HTTP_LOGGING_ENABLED=1 traefik:v2.0 --api.insecure=true --providers.docker=true --providers.docker.exposedbydefault=false --entrypoints.web.address=:80 --entryPoints.web.forwardedHeaders.insecure
2019-09-16 10:40:36.215 INFO - Pulling image from Docker hub:
containous/whoami 2019-09-16 10:40:36.779 INFO - latest Pulling from
containous/whoami 2019-09-16 10:40:36.780 INFO - Digest:
sha256:09229ae40edb92e95b15e90fef46bfadab14fd1ec2232aca717a501741fcf391
2019-09-16 10:40:36.788 INFO - Status: Image is up to date for
containous/whoami:latest 2019-09-16 10:40:36.790 INFO - Pull Image
successful, Time taken: 0 Minutes and 0 Seconds 2019-09-16
10:40:36.815 INFO - Starting container for site 2019-09-16
10:40:36.816 INFO - docker run -d -p 0:80 --name
containertest3_whoami_1_6d1c6629 -e
WEBSITES_ENABLE_APP_SERVICE_STORAGE=false -e
WEBSITE_SITE_NAME=containertest3 -e WEBSITE_AUTH_ENABLED=False -e
WEBSITE_ROLE_INSTANCE_ID=0 -e
WEBSITE_HOSTNAME=containertest3.azurewebsites.net -e
WEBSITE_INSTANCE_ID=902eae0c51eb407ec9308de2a1c3fce2b35f53f6d148e328560acba2560930f0
-e HTTP_LOGGING_ENABLED=1 containous/whoami
2019-09-16 10:40:47.048 INFO - Started multi-container app 2019-09-16
10:40:47.099 INFO - Initiating warmup request to container
containertest3_traefik_1_6d1c6629 for site containertest3 2019-09-16
10:40:47.101 INFO - Container containertest3_traefik_1_6d1c6629 for
site containertest3 initialized successfully and is ready to serve
requests.
However, in the error logs, Traefik can not acces the docker api and therefore does not work correctly:
2019-09-16T10:40:47.505909459Z time="2019-09-16T10:40:47Z" level=error
msg="Provider connection error Cannot connect to the Docker daemon at
unix:///var/run/docker.sock. Is the docker daemon running?, retrying
in 1.080381816s" providerName=docker
2019-09-16T10:40:48.585335458Z time="2019-09-16T10:40:48Z" level=error
msg="Failed to retrieve information of the docker client and server
host: Cannot connect to the Docker daemon at
unix:///var/run/docker.sock. Is the docker daemon running?"
Final error for the Traefik container:
"Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"
Any idea how to fix this error in the Azure app environment? Is this not possible within the Azure hosting/preview for App containers?
The Azure Web App service does not allow for reading from docker.sock.
You need to use another configuration provider in traefik if you want to run multi-container in Azure App Service.
Followed steps from the link to create a K8s cluster using the Azure Portal. Tried using kubectl on a remote machine to check if it's working. Got this error.
Unable to connect to the server: dial tcp 13.90.35.157:443: connectex:
A connection attempt failed because the connected party did not
properly respond after a period of time, or established connection
failed because connected host has failed to respond.
I can SSH to the K8s master. Tried kubectl get nodes from the master and got similar error.
It is really hard to say from such a description what went wrong, but as this is a new cluster ( and I'm saying this because sometimes k8s cluster gets deployed but doesn't really work, so ), I would suggest deleting it and creating a new one and\or creating it using the Azure Cli\Azure Cloud Shell.
Basically its as simple as:
az acs create -n acs-cluster -g acsrg1 -d applink789 --generate-ssh-keys
if you have the resource group created, if not you can create it with:
az group create -n acsrg1 -l "westus"
According to your description, it seems you have not configured the Service Principal correctly. I use wrong service principal to deploy K8S in Azure, get the same error:
C:\Users>kubectl get nodes
Unable to connect to the server: dial tcp 13.90.27.73:443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
You may need to check to ensure the credentials were provided accurately, and that the configured Service Principal has read and write permissions to the target Subscription.
If your Service Principal is misconfigured, none of the kubernetes components will come up in a healthy manner. We can check to see if this the problem:
root#k8s-master-6FEE48E1-0:~# journalctl -u kubelet | grep --text autorest
If you see output that looks like the following, it means you have not configured the service Principal correctly.
root#k8s-master-6FEE48E1-0:~# journalctl -u kubelet | grep --text autorest
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.447321 6028 kubelet.go:1186] Cannot get Node info: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.627128 6028 kubelet_node_status.go:70] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
Jun 01 01:58:47 k8s-master-6FEE48E1-0 docker[5522]: E0601 01:58:47.885092 6028 kubelet_node_status.go:70] Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: autorest#WithErrorUnlessStatusCode: POST https://login.microsoftonline.com/1fcf418e-66ed-4c99-9449-d8e18bf8737a/oauth2/token?api-version=1.0 failed with 400 Bad Request: StatusCode=400
More information about how to create /configure a service principal for ACS-Engin Kubernetes cluster, please refer to this link.