App Engine ignores symlinks to directories - Linux

I'm creating an app which runs on Google's App Engine with the custom flex environment. This app uses several (relative) symlinks which point to other directories in the project. But somehow those symlinks are ignored when I deploy the app.
It seems that the gcloud tool sends the source context (that is, all the files in my project) to the Google container builder before building and deploying the app:
$ gcloud --project=my-project --verbosity=info app deploy
(...)
Beginning deployment of service [default]...
Building and pushing image for service [default]
INFO: Uploading [/tmp/tmpZ4Jha_/src.tgz] to [eu.gcr.io/my-project/appengine/default.20171212t160803:latest]
Started cloud build [some-uid].
If I extract the contents of the .tgz file I can see that all the files and directories in the project are there. Except for symlinks pointing to directories (symlinks to files are included though). So the source context is missing all the symlinks to directories.
Not using symlinks is not an option, so does anybody know how to include symlinks to directories in the source context sent to Google?
Although I don't think it's relevant, here are the contents of the app.yaml:
env: flex
runtime: custom
runtime_config:
  document_root: docroot
manual_scaling:
  instances: 1
resources:
  cpu: 2
  memory_gb: 2
  disk_size_gb: 10

I've worked around this by deploying my python cloud functions from a temp directory, and using tar (on a Mac) to include files inside symlinked directories:
tar hc --exclude='__pycache__' {name} | tar x -C {tmpdirname}
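The same staging idea also works for the original App Engine case; this is only a sketch (the excludes and temp-dir handling are illustrative), relying on tar's -h/h flag to follow symlinks so the linked directories are copied as real files:
TMP_DIR=$(mktemp -d)
# h makes tar dereference symlinks, so symlinked directories end up as regular directories
tar ch --exclude='.git' . | tar x -C "$TMP_DIR"
(cd "$TMP_DIR" && gcloud --project=my-project app deploy)
rm -rf "$TMP_DIR"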

I use a workaround similar to Steve Alexander's, but in a more elaborate way: I have a shell script that creates a temp dir, copies the dependencies into it, sets up the environment and runs the gcloud command. It is basically something like this:
#!/bin/bash
# Load build/deploy settings (project, region, etc.)
. .env.sh

SRC_FILE=$1          # Python source file containing the function
SRC_FUNC=$2          # name of the function to deploy
TRIGGER_RESOURCE=$3  # e.g. a bucket name
TRIGGER_EVENT=$4     # e.g. google.storage.object.finalize

# Stage the sources and their dependencies in a temp dir
TMP_DIR=./tmp/deploy
mkdir -p $TMP_DIR
cp -r modules/dep1 $TMP_DIR
cp -r modules/dep2 $TMP_DIR
cp requirements.txt $TMP_DIR
cp $SRC_FILE $TMP_DIR/main.py

gcloud functions deploy $SRC_FUNC \
    --source=$TMP_DIR \
    --runtime=python39 \
    --trigger-resource $TRIGGER_RESOURCE \
    --trigger-event $TRIGGER_EVENT \
    --env-vars-file=./.env.yml \
    --timeout 540s

rm -rf $TMP_DIR
This script is tailored for a Google Storage event, i.e. to deploy a function that should be triggered when a new file is uploaded to a bucket:
./deploy.func.sh functions.py gs_new_file_event project-bucket1 google.storage.object.finalize
So in the example above, gs_new_file_event is a Python function defined in functions.py. The script copies the file with the Python code to the temp dir as main.py, which is what the function deployer expects. This works well for a project that defines multiple cloud functions in the same repository alongside their dependencies, where it is not possible to have all of the apps and functions defined in a top-level main.py. The script removes the temp dir when it is done, but it is a good idea to add the path to .gitignore anyway.
Here are a few things you can do to adapt the script to your own needs:
Set up the env files with all the required variables: .env.sh for the build and deployment, .env.yml for the function/app runtime (see the sketch after this list).
Fix the paths and dependencies.
Improve the handling of the command line arguments to make it more flexible and work for all kinds of GCloud triggers.
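For illustration, a minimal .env.yml consumed by --env-vars-file could look like this (the variable names and values are hypothetical):
DB_HOST: 10.0.0.5
BUCKET_NAME: project-bucket1
LOG_LEVEL: info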

Related

Docker compose with shared files [duplicate]

How can I include files from outside of Docker's build context using the "ADD" command in the Docker file?
From the Docker documentation:
The path must be inside the context of the build; you cannot ADD
../something/something, because the first step of a docker build is to
send the context directory (and subdirectories) to the docker daemon.
I do not want to restructure my whole project just to accommodate Docker in this matter. I want to keep all my Docker files in the same sub-directory.
Also, it appears Docker does not yet (and may not ever) support symlinks: Dockerfile ADD command does not follow symlinks on host #1676.
The only other thing I can think of is to include a pre-build step to copy the files into the Docker build context (and configure my version control to ignore those files). Is there a better workaround than that?
The best way to work around this is to specify the Dockerfile independently of the build context, using -f.
For instance, this command will give the ADD command access to anything in your current directory.
docker build -f docker-files/Dockerfile .
Update: Docker now allows having the Dockerfile outside the build context (fixed in 18.03.0-ce). So you can also do something like
docker build -f ../Dockerfile .
I often find myself utilizing the --build-arg option for this purpose. For example after putting the following in the Dockerfile:
ARG SSH_KEY
RUN mkdir -p /root/.ssh && echo "$SSH_KEY" > /root/.ssh/id_rsa
You can just do:
docker build -t some-app --build-arg SSH_KEY="$(cat ~/file/outside/build/context/id_rsa)" .
But note the following warning from the Docker documentation:
Warning: It is not recommended to use build-time variables for passing secrets like github keys, user credentials etc. Build-time variable values are visible to any user of the image with the docker history command.
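If the file really is a secret like an SSH key, a less leaky alternative (assuming BuildKit is available) is a secret mount, which is not baked into the image layers or visible in docker history; the id and paths below are illustrative:
# in the Dockerfile: RUN --mount=type=secret,id=ssh_key cp /run/secrets/ssh_key /root/.ssh/id_rsa
DOCKER_BUILDKIT=1 docker build --secret id=ssh_key,src=$HOME/.ssh/id_rsa -t some-app .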
I spent a good amount of time trying to figure out a good pattern and a better way to explain what's going on with this feature. I realized that the best way to explain it was as follows...
Dockerfile: only sees files under its own relative path
Context: a place in "space" where the files you want to share and your Dockerfile will be copied to
So, with that said, here's an example of the Dockerfile that needs to reuse a file called start.sh
Dockerfile
It will always load from its relative path, having the current directory of itself as the local reference to the paths you specify.
COPY start.sh /runtime/start.sh
Files
Considering this idea, we can have multiple Dockerfiles, each building something specific, but all of them needing access to start.sh.
./all-services/
/start.sh
/service-X/Dockerfile
/service-Y/Dockerfile
/service-Z/Dockerfile
./docker-compose.yaml
Considering this structure and the files above, here's a docker-compose.yml
docker-compose.yaml
In this example, your shared context directory is the runtime directory.
Same mental model here, think that all the files under this directory are moved over to the so-called context.
Similarly, just specify the Dockerfile that you want to copy to that same directory. You can specify that using dockerfile.
The directory where your main content is located is the actual context to be set.
The docker-compose.yml is as follows
version: "3.3"
services:
service-A
build:
context: ./all-service
dockerfile: ./service-A/Dockerfile
service-B
build:
context: ./all-service
dockerfile: ./service-B/Dockerfile
service-C
build:
context: ./all-service
dockerfile: ./service-C/Dockerfile
all-services is set as the context; the shared file start.sh is available there, as is the Dockerfile specified by each dockerfile entry.
Each gets built its own way, sharing the start file!
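For comparison, roughly the same build for one of these services can be expressed with plain docker, which makes the shared context explicit (the image tag here is just an example):
docker build -f ./all-services/service-X/Dockerfile -t service-x ./all-services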
On Linux you can mount other directories instead of symlinking them
mount --bind olddir newdir
See https://superuser.com/questions/842642 for more details.
I don't know if something similar is available for other OSes.
I also tried using Samba to share a folder and remount it into the Docker context which worked as well.
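As a rough sketch of the bind-mount approach above (the paths are hypothetical, and the mount requires root):
sudo mount --bind /path/to/shared-lib ./vendor/shared-lib
docker build -t my-app .
sudo umount ./vendor/shared-lib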
If you read the discussion in issue 2745, not only may Docker never support symlinks, it may never support adding files outside your context. It seems to be a design philosophy that files that go into docker build should explicitly be part of its context, or come from a URL where they are presumably deployed with a fixed version, so that the build is repeatable with well-known URLs or files shipped with the Docker container.
I prefer to build from a version controlled source - ie docker build
-t stuff http://my.git.org/repo - otherwise I'm building from some random place with random files.
fundamentally, no.... -- SvenDowideit, Docker Inc
Just my opinion but I think you should restructure to separate out the code and docker repositories. That way the containers can be generic and pull in any version of the code at run time rather than build time.
Alternatively, use docker as your fundamental code deployment artifact and put the Dockerfile in the root of the code repository. If you go this route, it probably makes sense to have a parent Docker container for more general system-level details and a child container for setup specific to your code.
I believe the simpler workaround would be to change the 'context' itself.
So, for example, instead of giving:
docker build -t hello-demo-app .
which sets the current directory as the context, let's say you wanted the parent directory as the context, just use:
docker build -t hello-demo-app ..
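Note that with the parent directory as the context, docker looks for <context>/Dockerfile by default, so if the Dockerfile still lives in the current directory you may need to point at it explicitly, mirroring the -f examples above:
docker build -t hello-demo-app -f Dockerfile ..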
You can also create a tarball of what the image needs first and use that as your context.
https://docs.docker.com/engine/reference/commandline/build/#/tarball-contexts
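A rough sketch of the tarball-context approach (paths are illustrative); as a bonus, tar's -h flag dereferences symlinks, which also sidesteps the symlink limitation:
tar -czh -f context.tar.gz -C /path/to/project .
docker build -t my-image - < context.tar.gz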
This behavior is determined by the context directory that docker or podman uses to present the files to the build process.
A nice trick here is to change the context dir during the build instruction to the full path of the directory you want to expose to the daemon.
e.g:
docker build -t imageName:tag -f /path/to/the/Dockerfile /mysrc/path
Using /mysrc/path instead of . (the current directory), you'll be using that directory as the context, so any files under it can be seen by the build process.
In this example you'll be exposing the entire /mysrc/path tree to the Docker daemon.
When using this with docker, the user ID who triggered the build must have recursive read permissions to every directory and file in the context dir.
This can be useful in cases where you have /home/user/myCoolProject/Dockerfile but want to bring into this container build context files that aren't in the same directory.
Here is an example of building using context dir, but this time using podman instead of docker.
Let's take as an example having, inside your Dockerfile, a COPY or ADD instruction that copies files from a directory outside of your project, like:
FROM myImage:tag
...
...
COPY /opt/externalFile ./
ADD /home/user/AnotherProject/anotherExternalFile ./
...
In order to build this, with a container file located in the /home/user/myCoolProject/Dockerfile, just do something like:
cd /home/user/myCoolProject
podman build -t imageName:tag -f Dockerfile /
Some known use cases for changing the context dir are when using a container as a toolchain for building your source code.
e.g:
podman build --platform linux/s390x -t myimage:mytag -f ./Dockerfile /tmp/mysrc
or it can be a path relative, like:
podman build --platform linux/s390x -t myimage:mytag -f ./Dockerfile ../../
Another example using this time a global path:
FROM myImage:tag
...
...
COPY externalFile ./
ADD AnotherProject ./
...
Notice that now the full global path for the COPY and ADD is omitted in the Dockerfile command layers.
In this case the context dir must be relative to where the files are; if both externalFile and AnotherProject are in the /opt directory, then the context dir for building it must be:
podman build -t imageName:tag -f ./Dockerfile /opt
Note when using COPY or ADD with a context dir in docker:
The Docker daemon will try to "stream" all the files visible in the context dir tree to the daemon, which can slow down the build, and it requires the user to have recursive read permission on the context dir.
This behavior can be even more costly when driving the build through the API. With podman, however, the build starts immediately and recursive permissions aren't needed, because podman does not enumerate the entire context dir and doesn't use a client/server architecture.
For such cases, when you need a different context dir, it can be much more interesting to use podman instead of docker.
Some references:
https://docs.docker.com/engine/reference/commandline/build/
https://docs.podman.io/en/latest/markdown/podman-build.1.html
As described in this GitHub issue, the build actually happens in /tmp/docker-12345, so a relative path like ../relative-add/some-file is relative to /tmp/docker-12345. It would thus search for /tmp/relative-add/some-file, which is also shown in the error message.
It is not allowed to include files from outside the build directory, so this results in the "Forbidden path" message.
Using docker-compose, I accomplished this by creating a service that mounts the volumes that I need and committing the image of the container. Then, in the subsequent service, I rely on the previously committed image, which has all of the data stored at the mounted locations. You will then have to copy these files to their ultimate destination, as host-mounted directories do not get committed when running a docker commit command.
You don't have to use docker-compose to accomplish this, but it makes life a bit easier
# docker-compose.yml
version: '3'
services:
  stage:
    image: alpine
    volumes:
      - /host/machine/path:/tmp/container/path
    command: sh -c "cp -r /tmp/container/path /final/container/path"
  setup:
    image: stage
# setup.sh
# Start "stage" service
docker-compose up stage
# Commit changes to an image named "stage"
docker commit $(docker-compose ps -q stage) stage
# Start setup service off of stage image
docker-compose up setup
Create a wrapper docker build shell script that grabs the file then calls docker build then removes the file.
A simple solution not mentioned anywhere here, from my quick skim:
have a wrapper script called docker_build.sh (sketched below)
have it create tarballs and copy large files into the current working directory
call docker build
clean up the tarballs, large files, etc.
This solution is good because (1) it doesn't have the security hole of copying in your SSH private key, and (2) another solution uses sudo bind, which has its own security hole because it requires root permission to do the bind.
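A minimal sketch of such a wrapper, with hypothetical file names and paths:
#!/bin/bash
set -e
# stage the outside-context files next to the Dockerfile
cp ../shared/config.json ./config.json
tar -czf vendor.tar.gz -C ../vendor .
docker build -t my-app .
# clean up so the copies never get committed
rm -f ./config.json vendor.tar.gz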
I think as of earlier this year a feature was added in buildx to do just this.
If you have dockerfile 1.4+ and buildx 0.8+ you can do something like this
docker buildx build --build-context othersource=../something/something .
Then in your Dockerfile you can use the --from flag to reference that named context
COPY --from=othersource . /stuff
See this related post https://www.docker.com/blog/dockerfiles-now-support-multiple-build-contexts/
Workaround with links:
ln path/to/file/outside/context/file_to_copy ./file_to_copy
On Dockerfile, simply:
COPY file_to_copy /path/to/file
I was personally confused by some answers, so decided to explain it simply.
You should pass the context you have assumed in the Dockerfile to docker when you want to create an image.
I always select the root of the project as the context.
So for example, if you use a COPY command like COPY . .,
the first dot (.) is the context and the second dot (.) is the container working directory.
Assuming the context is the project root, dot (.), and the code structure is like this:
sample-project/
  docker/
    Dockerfile
If you want to build the image and your path (the path from which you run the docker build command) is /full-path/sample-project/, you should do this:
docker build -f docker/Dockerfile .
and if your path is /full-path/sample-project/docker/,
you should do this
docker build -f Dockerfile ../
An easy workaround might be to simply mount the volume (using the -v or --mount flag) to the container when you run it and access the files that way.
example:
docker run -v /path/to/file/on/host:/desired/path/to/file/in/container/ image_name
for more see: https://docs.docker.com/storage/volumes/
I had this same issue with a project and some data files that I wasn't able to move inside the repo context for HIPAA reasons. I ended up using two Dockerfiles. One builds the main application without the stuff I needed outside the container and publishes that to an internal repo. Then a second Dockerfile pulls that image, adds the data and creates a new image, which is then deployed and never stored anywhere. Not ideal, but it worked for my purposes of keeping sensitive information out of the repo.
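A rough sketch of that flow, with hypothetical registry names, tags and data paths:
# built from the repo; contains no sensitive data
docker build -t registry.internal/my-app:base .
docker push registry.internal/my-app:base
# built outside the repo, next to the restricted files, from a small
# Dockerfile.data that does FROM registry.internal/my-app:base and COPYs the data in
cd /secure/data-dir
docker build -f Dockerfile.data -t registry.internal/my-app:deploy .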
In my case, my Dockerfile is written like a template containing placeholders which I replace with real values using my configuration file.
So I couldn't specify this file directly, but had to pipe it into the docker build like this:
sed "s/%email_address%/$EMAIL_ADDRESS/;" ./Dockerfile | docker build -t katzda/bookings:latest . -f -;
Because of the pipe alone, the COPY command didn't work: running only docker build - provides neither a build context nor a Dockerfile on disk. The command above solves it with -f - (read the Dockerfile from stdin) while still passing . as the context, so COPY has files to work with.
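To make the distinction concrete (same template, different context handling):
# no context at all: stdin is the Dockerfile, so COPY has nothing to copy from
sed "s/%email_address%/$EMAIL_ADDRESS/" ./Dockerfile | docker build -t katzda/bookings:latest -
# Dockerfile from stdin, context from the current directory: COPY works
sed "s/%email_address%/$EMAIL_ADDRESS/" ./Dockerfile | docker build -t katzda/bookings:latest -f - .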
How to share typescript code between two Dockerfiles
I had this same problem, but for sharing files between two typescript projects. Some of the other answers didn't work for me because I needed to preserve the relative import paths between the shared code. I solved it by organizing my code like this:
api/
  Dockerfile
  src/
    models/
      index.ts
frontend/
  Dockerfile
  src/
    models/
      index.ts
shared/
  model1.ts
  model2.ts
  index.ts
.dockerignore
Note: After extracting the shared code into that top-level folder, I avoided needing to update the import paths because I updated api/src/models/index.ts and frontend/src/models/index.ts to re-export from shared (e.g. export * from '../../../shared').
Since the build context is now one directory higher, I had to make a few additional changes:
Update the build command to use the new context:
docker build -f Dockerfile .. (two dots instead of one)
Use a single .dockerignore at the top level to exclude all node_modules. (eg **/node_modules/**)
Prefix the Dockerfile COPY commands with api/ or frontend/
Copy shared (in addition to api/src or frontend/src)
WORKDIR /usr/src/app
COPY api/package*.json ./ <---- Prefix with api/
RUN npm ci
COPY api/src api/ts*.json ./ <---- Prefix with api/
COPY shared usr/src/shared <---- ADDED
RUN npm run build
This was the easiest way I could send everything to docker, while preserving the relative import paths in both projects. The tricky (annoying) part was all the changes/consequences caused by the build context being up one directory.
One quick and dirty way is to set the build context up as many levels as you need - but this can have consequences.
If you're working in a microservices architecture that looks like this:
./Code/Repo1
./Code/Repo2
...
You can set the build context to the parent Code directory and then access everything, but it turns out that with a large number of repositories, this can result in the build taking a long time.
An example situation could be that another team maintains a database schema in Repo1 and your team's code in Repo2 depends on this. You want to dockerise this dependency with some of your own seed data without worrying about schema changes or polluting the other team's repository (depending on what the changes are you may still have to change your seed data scripts of course)
The second approach is hacky but gets around the issue of long builds:
Create a sh (or ps1) script in ./Code/Repo2 to copy the files you need and invoke the docker commands you want, for example:
#!/bin/bash
rm -rf ./db/schema
mkdir -p ./db/schema
cp -r ../Repo1/db/schema/. ./db/schema/
docker-compose -f docker-compose.yml down
docker container prune -f
docker-compose -f docker-compose.yml up --build
In the docker-compose file, simply set the context as Repo2 root and use the content of the ./db/schema directory in your dockerfile without worrying about the path.
Bear in mind that you will run the risk of accidentally committing this directory to source control, but scripting cleanup actions should be easy enough.

Nextjs App reading configuration from Azure App Service

We have a Next.js project which is built by Docker and deployed into an Azure App Service (container). We also set up configuration values within the App Service and try to access them, however it's not working as expected.
A few things we tried:
Restarting the App Service after adding new configuration
removing .env file while building the docker image
including .env file while building the docker image
Here's how we try to read the environment variables within the App Service:
const env = process.env.NEXT_PUBLIC_ENV;
const A = process.env.NEXT_PUBLIC_AS_VALUE;
Wondering if this can actually be done?
Just thinking something out loud below,
Since we're deploying the Docker image within the App Service's container (Linux), does that mean the container can't pull the value from this environment variable?
The Docker image already performs the npm run build, so does that mean the image is statically formed (build time) and will never read from the App Service configuration (runtime)?
After a day or 2, I came up with an alternative solution by passing the environment values in Dockerfile while building my project.
TLDR
Pass your env values within the Dockerfile
Set all your env (dev, staging, prod, etc.) var values in Pipeline variables.
Set a "settable" variable inside the Pipeline variables too, so you can choose which environment to build when triggering your pipeline (e.g. buildEnv).
Set up a bash script to perform the variable text changes (e.g. from firebaseApiKey to DEVfirebaseApiKey) according to the env received from buildEnv.
Use the "replace tokens" task from Azure Pipelines to replace values inside the Dockerfile
Build your docker image
Huaala~ now you get your environment-specific build
Details
Within your Dockerfile you can place your env variable like this
RUN NEXT_PUBLIC_ENV=#{env}# \
NEXT_PUBLIC_FIREBASE_API_KEY=#{firebaseApiKey}# \
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=#{firebaseAuthDomain}# \
NEXT_PUBLIC_FIREBASE_PROJECT_ID=#{firebaseProjectId}# \
NEXT_PUBLIC_FIREBASE_STORAGE_BUCKET=#{firebaseStorageBucket}# \
NEXT_PUBLIC_FIREBASE_MESSAGING_SENDER_ID=#{firebaseMessagingSenderId}# \
NEXT_PUBLIC_FIREBASE_APP_ID=#{firebaseAppId}# \
NEXT_PUBLIC_FIREBASE_MEASUREMENT_ID=#{firebaseMeasurementId}# \
NEXT_PUBLIC_BASE_URL=#{baseURL}# \
npm run build
These values (e.g. baseURL, firebaseMeasurementId, etc.) are only placeholders, because we need to change them later with a bash script according to the buildEnv we receive. (buildEnv is settable when you trigger a build.)
A bash script sample is below. What it does is look within your Dockerfile for the word env and change it to DEVenv / UATenv / PRODenv based on what you're passing to buildEnv.
#!/bin/bash
case $(buildENV) in
  dev)
    sed -i -e 's/env/DEVenv/g' ./Dockerfile
    ;;
  uat)
    sed -i -e 's/env/UATenv/g' ./Dockerfile
    ;;
  prod)
    sed -i -e 's/env/PRODenv/g' ./Dockerfile
    ;;
  *)
    echo -n "unknown"
    ;;
esac
When this is complete, your "environment specific" Dockerfile is effectively created. Now we'll make use of the "replace tokens" task from Azure Pipelines to replace the values inside the Dockerfile. Make sure you have all your values set up in Pipeline Variables!
Lastly, you may build your Docker image and deploy :)

reading .env file from node - env file is not published

I am trying to read a .env file using the "dotenv" package, but process.env.DB_HOST returns undefined after publishing to Cloud Run. When I output all files to the log I see every file in the root directory except the .env file. I do have a .env file at the root of my project, so I'm not sure why it isn't getting pushed to gcloud, or is it? I do get a value for process.env.DB_HOST when I test locally.
I used this command to publish to Cloud Run:
gcloud builds submit --tag gcr.io/my-project/test-api:1.0.0 .
If you don't have a .gcloudignore file in your project, the gcloud CLI uses the .gitignore by default.
Create a .gcloudignore and put in it the files that you don't want to upload when you use gcloud CLI commands. So, don't put the .env in it!
EDIT 1
When you add a .gcloudignore, the gcloud CLI no longer reads the .gitignore file and uses the .gcloudignore instead.
Therefore, you can define 2 different logics:
.gitignore lists the files that you don't want to push to the repository. Put the .env file in it to NOT commit it.
.gcloudignore lists the files that you don't want to send with the gcloud CLI. DON'T put the .env file in it, so it is included when you send your code with the gcloud CLI.
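For example, a .gcloudignore along these lines (the entries are illustrative) keeps repo-only clutter out of the upload while still letting gcloud send the .env file:
.git
.gitignore
node_modules/
# .env is deliberately NOT listed here, so it gets uploaded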

gcloud app deploy failed with error - gcloud crashed FileNotFoundError - python3 app

I am trying to deploy a sample python app which I got from another tutorial. However, the deployment fails as below:
gcloud app deploy Beginning deployment of service [default]... ERROR:
gcloud crashed (FileNotFoundError): [Errno 2] No such file or
directory:
'/Users/nileshdeshmukh/Desktop/Training/Python/FlaskIntroduction-master/env/.Python'
My app.yaml file is as below:
runtime: python3
env: standard
runtime_config:
  python_version: 3
I have all dependencies copied into env/bin, but the build process is looking in env only.
I think the problem would be solved if the deployment process looked at env/bin, but I don't know how to force it to look at the given path.
The runtime_config setting is for App Engine flex only and isn't needed for App Engine Standard. You can safely remove it.
As per the error, you should ensure that all your dependencies are self-contained and shipped with your app or listed in your requirements.txt file.
Be careful: some gcloud commands use the .gitignore file to avoid sending useless files to the cloud when building your app.
You can override this behavior by creating a .gcloudignore file. It uses the same syntax as .gitignore but is taken into account only by gcloud commands and not by git. This way you can differentiate the files to send to the cloud from the files to commit to git.
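Since the crash points at env/.Python (an artifact of a local virtualenv), one hedged option is a .gcloudignore that skips the virtualenv directory entirely and lets requirements.txt drive the build, for example:
.gcloudignore
.git
.gitignore
env/
__pycache__/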

Gitlab CI Web Deployment

So we are currently moving away from our current deployment provider, Beanstalk, which is great, but we are on the top tier and we keep running out of space or hitting our repository limits. Since we are moving away, please do not suggest any other SaaS provider.
I personally use GitLab for my own projects and a few company projects and it's amazing; we use a self-hosted version on our local server in our company building.
We have CI set up and are currently using the following deployment code (I have trimmed it down to just the development deployment). This uses the shell executor for deploying, as we deploy to an existing Linux server.
variables:
  HOSTNAME: '<hostname>'
  USERNAME: '<username>'
  PASSWORD: '<password>'
  PATH_DEV: '/path/to/www'
# Define the stages (we can add as many as we want)
stages:
  # - build
  - deploy
# The code for development deployment
deploy_dev:
  stage: deploy
  script:
    - echo "Deploying to development environment..."
    - rm .gitlab-ci.yml
    - rsync -urltvz --filter=':- .gitignore' --exclude=".git" -e "sshpass -p"$PASSWORD" ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" * $USERNAME@$HOSTNAME:$PATH_DEV
    - echo "Finished deploying."
  environment:
    name: Development
    url: http://dev.domain.com
  only:
    - envdev
The Problem:
When we use the above code to deploy it's perfect and works really well, and it deploys all the code after optimisation etc, but we have found a little bug here.
When you delete a file, the rsync command will not delete it on the server. I did some searching and found the --remove flag you can add, and it worked - but it deleted all the user-uploaded content as well. I then added the .gitignore into the filtering, so it would ignore some of the files in there (which are usually user generated) and configuration files and/or libraries (npm, etc.). This was fine until a user started uploading files using the media manager in our framework, which stores them in a folder that is not in the .gitignore file - and can't be, because it contains other files: we also add our own files in there so they're editable by the user. So now I am unsure how to manage this.
What we are looking for is a CI setup that uploads only the file changes to the server: it would search through the latest commits, find the files that have changed and push only those files up. Of course I would like to keep doing this with GitLab CI, so any ideas, examples or tutorials would be amazing.
Thanks in advance.
~ Danny
Maybe this helps: https://github.com/banago/PHPloy
It looks like this tool is designed for PHP projects, but I think it can be used for other web deployments too.
How it works:
PHPloy stores a file called .revision on your server. This file contains the hash of the commit that you have deployed to that server. When you run phploy, it downloads that file and compares the commit reference in it with the commit you are trying to deploy to find out which files to upload. PHPloy also stores a .revision file for each submodule in your repository.
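PHPloy's .revision idea can also be approximated directly in a GitLab CI job with git and rsync. This is only a sketch using the same variables as the pipeline above; it assumes a .revision file on the server, a full clone (e.g. GIT_DEPTH: 0), and that deleted files are handled separately:
# fetch the last deployed commit, if any
LAST=$(sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no $USERNAME@$HOSTNAME "cat $PATH_DEV/.revision" || echo "")
if [ -n "$LAST" ]; then
  git diff --name-only "$LAST" HEAD > changed.txt
else
  git ls-files > changed.txt
fi
# upload only the changed files, then record the new revision on the server
rsync -urltvz --files-from=changed.txt -e "sshpass -p"$PASSWORD" ssh -o StrictHostKeyChecking=no" ./ $USERNAME@$HOSTNAME:$PATH_DEV
sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no $USERNAME@$HOSTNAME "echo $(git rev-parse HEAD) > $PATH_DEV/.revision"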
