How to persist data and schema when a Cassandra Container is deleted - cassandra

I create a Cassandra container, run it, create a schema and add data to it. When I later stop and delete the container and create a new one from the image, the previously created schema and data are lost.
Can I/how can I persist the data and schema so that whenever a container is created from the image, it picks up the existing schema and data?
Or would I have to recreate the schema and re-add the data?

To do this, I used the -v option to mount an external directory (a directory on the host, outside the container).
E.g.
For container 1 (node1)
docker run -v C:\...\dockertest\dockerdata\cassandra1:/var/lib/cassandra --network cassandra-net -p 9042:9042 --name cassandra1 cassandra_image_id
For container 2 (node2)-
docker run -v C:\...\dockertest\dockerdata\cassandra2:/var/lib/cassandra --network cassandra-net --name cassandra2 -e CASSANDRA_SEEDS=cassandra1 cassandra_image_id
If running both containers on the same machine, the external directory must be different for each node/container.
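As an alternative to host-path bind mounts (which can hit permission and filesystem quirks on Windows hosts), Docker-managed named volumes can hold the data instead. A sketch, assuming the same network and image as above; the volume names cassandra1-data and cassandra2-data are just illustrative:

```shell
# Create one named volume per node (Docker manages the storage location)
docker volume create cassandra1-data
docker volume create cassandra2-data

# Node 1: mount the named volume instead of a host path
docker run -d --name cassandra1 --network cassandra-net -p 9042:9042 \
  -v cassandra1-data:/var/lib/cassandra cassandra_image_id

# Node 2: its own volume, seeded from node 1
docker run -d --name cassandra2 --network cassandra-net \
  -e CASSANDRA_SEEDS=cassandra1 \
  -v cassandra2-data:/var/lib/cassandra cassandra_image_id
```

Deleting the containers leaves both volumes intact; a new container started with the same -v mapping picks up the existing schema and data.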

Related

How to mount a host directory into a running docker container

I want to mount my USB drive into a running docker instance to manually back up some files.
I know of the -v feature of docker run, but this creates a new container.
Note: it's a nextcloudpi container.
You can only change a very limited set of container options after a container starts up. Options like environment variables and container mounts can only be set during the initial docker run or docker create. If you want to change these, you need to stop and delete your existing container, and create a new one with the new mount option.
If there's data that you think you need to keep or back up, it should live in some sort of volume mount anyway. Delete and restart your container, and use a -v option to mount a volume at the path where the data is kept. The Docker documentation has an example using named volumes with separate backup and restore containers; or you can directly use a host directory and your normal backup solution there. (Deleting and recreating a container as I suggested in the first paragraph is extremely routine, and this shouldn't involve explicit "backup" and "restore" steps.)
If you have data that's there right now that you can't afford to lose, you can docker cp it out of the container before setting up a more robust storage scheme.
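For example, assuming your container is named mynextcloudpi and the files you care about live under /data (adjust both to your setup):

```shell
# Copy a directory out of the container to the host before recreating it
docker cp mynextcloudpi:/data ./nextcloudpi-backup
```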
As David Maze mentioned, it's almost impossible to change the volume configuration of an existing container using normal docker commands.
I found an alternative way that works for me. The main idea is to commit the existing container to a new Docker image and start a new container on top of it. Hope it works for you too.
# Create a new image from the container
docker commit CONTAINERID NEWIMAGENAME
# Create a new container on the top of the new image
docker run -v HOSTLOCATION:CONTAINERLOCATION NEWIMAGENAME
I know the question is from May, but for future searchers:
Create a mounting point on the host filesystem:
sudo mkdir /mnt/usb-drive
Run the docker container using the --mount option and set the "bind propagation" to "shared":
docker run --name mynextcloudpi -it --mount type=bind,source=/mnt/usb-drive,target=/mnt/disk,bind-propagation=shared nextcloudpi
Now you can mount your USB drive to the /mnt/usb-drive directory and it will be mounted to the /mnt/disk location inside the running container.
E.g: sudo mount /dev/sda1 /mnt/usb-drive
Adjust /dev/sda1 to match your device, of course.
More info about bind-propagation: https://docs.docker.com/storage/bind-mounts/#configure-bind-propagation

docker volume : data lost when container is removed or stopped and then restarted

I have set up 2 containers:
Mongo: as the database -> the volume is mounted on a specified path
docker run --hostname raw-data-mongo --name=raw-data-mongo
--network=our-network -p 27017:27017 -v data/db/raw-data-db:/data/db -d mongo
Node.js: saves some data to MongoDB
docker run -d -p 80:3001 --name=raw-data-container
--network=our-network -v /data/db/raw-data-db:/data/db raw-data-api
But any time I save data (using my Node app container, connecting to the mongo container) and then remove my node container and restart it, I can't retrieve the data I saved.
How should I mount a volume that is independent of the node container, and if possible independent of any container?
I wish I could remove and restart containers and still be able to retrieve my data.
You can use -v to specify the volume against which you want your docker deployment to save its data. Two things to keep in mind: first, the image you are running should expose a volume path and use it to store its data (mongodb does this, as do most projects); second, the persistent volume path passed with -v should exist, and the docker process should have write permissions on it.
Hope this helps!
Please try the following:
Step 1: Make the directory
$HOME/data/db/raw-data-db
Step 2: Then assign the volume
docker run --hostname raw-data-mongo --name=raw-data-mongo
--network=our-network -p 27017:27017 -v $HOME/data/db/raw-data-db:/data/db -d mongo
This way, your data for MongoDB will be persisted in the volume. For Node.js, use a different volume.

build docker image (centos) with volume with data for testing (postgresql)

I want to build an image with the data already persisted in it, so that other developers can run a container from it and use it.
I use this https://hub.docker.com/r/centos/postgresql-96-centos7/ and follow these steps:
1) Run it with the command
docker run -d -e POSTGRESQL_USER=svcg -e POSTGRESQL_PASSWORD=svcg1 -e POSTGRESQL_DATABASE=bcs -p 5432:5432 --expose=5432 --name=postgres --net=docker-net -v psql:/var/lib/pgsql/data centos/postgresql-96-centos7
The result is a running container "postgres" and volume "psql".
2) Restore data from the dumpName.xml
java -jar database_backup_starter.jar -u svcg -p svcg1 -url jdbc:postgresql://192.168.99.100:5432/bcs -m restore -path database_dump
The volume "psql" contain the testing data.
Is it possible to do this in one step? For instance, can I create an image from the volume, or from the container together with its data volume? How would I do it?
Since this image declares a VOLUME for the PostgreSQL data, you cannot make a derived image from this with prepopulated data in a volume, either using docker build or docker commit. It won't work (see the note "changing the volume within the Dockerfile").
The standard way to load data into a newly created database container seems to be to put a SQL-syntax database dump in the initialization directory when you start the container for the first time. In this image, it looks like you'd have to wrap the database dump in a shell script and put it in an /opt/app-root/src/postgresql-init/ directory. For the stock postgres image you'd put a *.sql file in /docker-entrypoint-initdb.d/.
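For the stock postgres image, that looks roughly like the following; the seed.sql contents and the names pg-seeded and initdb are just illustrative. Scripts in /docker-entrypoint-initdb.d/ run once, on the first start with an empty data directory:

```shell
mkdir -p ./initdb

# A SQL file placed here is executed when the database is first initialized
cat > ./initdb/seed.sql <<'EOF'
CREATE TABLE demo (id int PRIMARY KEY, name text);
INSERT INTO demo VALUES (1, 'sample');
EOF

docker run -d --name pg-seeded \
  -e POSTGRES_PASSWORD=secret \
  -v "$PWD/initdb":/docker-entrypoint-initdb.d \
  postgres:9.6
```

Other developers then get the seed data from the init script rather than from a prepopulated volume baked into the image.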

Stop VM with MongoDB docker image without losing data

I have installed the official MongoDB docker image in a VM on AWS EC2, and the database has already data on it. If I stop the VM (to save expenses overnight), will I lose all the data contained in the database? How can I make it persistent in those scenarios?
There are multiple options to achieve this, but the 2 most common ways are:
1. Create a directory on your host to mount the data
2. Create a docker volume to mount the data
1) Create a data directory on a suitable volume on your host system, e.g. /my/own/datadir. Start your mongo container like this:
$ docker run --name some-mongo -v /my/own/datadir:/data/db -d mongo:tag
The -v /my/own/datadir:/data/db part of the command mounts the /my/own/datadir directory from the underlying host system as /data/db inside the container, where MongoDB by default will write its data files.
Note that users on host systems with SELinux enabled may see issues with this. The current workaround is to assign the relevant SELinux policy type to the new data directory so that the container will be allowed to access it:
$ chcon -Rt svirt_sandbox_file_t /my/own/datadir
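Alternatively, Docker can apply the SELinux label for you by appending the :z (shared among containers) or :Z (private to this container) option to the mount:

```shell
# Docker relabels /my/own/datadir so the container may access it
docker run --name some-mongo -v /my/own/datadir:/data/db:Z -d mongo:tag
```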
The source of this is the official documentation of the image.
2) Another possibility is to use a docker volume.
$ docker volume create my-volume
This will create a docker volume in the folder /var/lib/docker/volumes/my-volume. Now you can start your container with:
docker run --name some-mongo -v my-volume:/data/db -d mongo:tag
All the data will be stored in my-volume, i.e. in the folder /var/lib/docker/volumes/my-volume. So even when you delete your container and create a new mongo container linked with this volume, your data will be loaded into the new container.
You can also use the --restart=always option when you perform your initial docker run command. This means that your container will automatically restart after a reboot of your VM. When you've persisted your data as well, there will be no difference between your DB before and after the reboot.
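For example, combining the named volume with the restart policy; docker update can also change the policy on a container that already exists:

```shell
# Volume-backed mongo that comes back up automatically after a VM reboot
docker run -d --name some-mongo --restart=always -v my-volume:/data/db mongo:tag

# Change the restart policy of an existing container without recreating it
docker update --restart=always some-mongo
```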

Why use a data-only container over a host mount?

I understand the concept of data-only containers
But why would you use a data-only container over a simple host mount, given that data-only containers seem to make it harder to find the data?
When you don't want to manage the mount yourself and don't need to find the data frequently. A good example is database containers, where using a data-only container provides you with the following conveniences:
No need to even know which volumes you have to create for a mature container, e.g.
docker run --name my-data tutum/mysql:5.5 true
docker run -d --name my --volumes-from my-data tutum/mysql:5.5
Simplified management via docker. You don't have to manually delete the host directory or create a new path when you need to start anew.
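Note that since Docker 1.9, named volumes cover the same use case with less ceremony than the data-only container pattern; a sketch, with mysql-data as an illustrative volume name:

```shell
# A named volume replaces the data-only container
docker volume create mysql-data
docker run -d --name my-mysql -v mysql-data:/var/lib/mysql tutum/mysql:5.5

# docker volume inspect reports where the data actually lives on the host
docker volume inspect mysql-data
```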
