The local cache for Azure CloudDrive is great for performance.
I create a new snapshot every 20 minutes, with little or no change in each one.
To use the new snapshot, what I do is:
cloudDrive.Unmount();
cloudDrive = storageAccount.CreateCloudDrive(newSnapshotUri);
cloudDrive.Mount(size, option);
I'd like to know: will the old cache still be used for the newly mounted snapshot, or does the whole cache have to be rebuilt?
This is purely an educated guess, but I assume the cache is rebuilt. My reasoning is that when you mount the new snapshot there's no way to know which data is the same and which is different, so it would be impossible to figure out what to invalidate in the cache.
We have a VPS running on Google Cloud which had a very important folder in a user directory. An employee of ours deleted that folder and we can't figure out how to recover it. I came across extundelete, but it seems the partition needs to be unmounted for it to work, and I don't understand how I would do that on Google Cloud. This project took more than a year, and that was the latest copy after a fire took out the last copy on our local servers.
Could anyone please help or guide me in the right direction?
Getting any files back from your VM's disk may be tricky (at best) or impossible (most likely) if the files have been overwritten.
The easiest way would be to get them back from a copy or snapshot of your VM's disk. If you have a snapshot of the disk (taken either manually or automatically) from before the folder in question was deleted, then you will get your files back.
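A rough gcloud sketch of that restore path (disk, snapshot, instance, zone and path names below are placeholders, not your actual ones): create a disk from the snapshot, attach it to the VM as a second disk, mount it read-only and copy the folder back.
gcloud compute disks create recovered-disk --source-snapshot=my-snapshot --zone=us-central1-a
gcloud compute instances attach-disk my-vm --disk=recovered-disk --zone=us-central1-a
# inside the VM (the new disk usually shows up as /dev/sdb):
sudo mkdir -p /mnt/restore
sudo mount -o ro /dev/sdb1 /mnt/restore
cp -a /mnt/restore/home/someuser/important-folder /home/someuser/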
If you don't have any backups, you can try to recover the files. I've found many guides and tutorials; let me link the ones I believe will help you the most:
Unix/Linux undelete/recover deleted files
Recovering accidentally deleted files
Get list of files deleted by rm -rf
------------- UPDATE -----------
Your last chance in this battle is to make two clones of the disk. Then detach the original disk from the VM and attach one of the clones to keep your VM running, and use the second clone for any experiments. Keep the original untouched in case you mess up the second clone.
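A rough gcloud sequence for that (disk, snapshot, instance and zone names are placeholders, and the VM has to be stopped before you swap its boot disk):
gcloud compute instances stop my-vm --zone=us-central1-a
gcloud compute disks snapshot original-disk --snapshot-names=rescue-snap --zone=us-central1-a
gcloud compute disks create clone-1 --source-snapshot=rescue-snap --zone=us-central1-a
gcloud compute disks create clone-2 --source-snapshot=rescue-snap --zone=us-central1-a
gcloud compute instances detach-disk my-vm --disk=original-disk --zone=us-central1-a
gcloud compute instances attach-disk my-vm --disk=clone-1 --boot --zone=us-central1-a
gcloud compute instances start my-vm --zone=us-central1-a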
Now create a new Windows VM and attach your second clone as an additional disk. At this point you're ready to try various data recovery software:
UFS Explorer
Virtual Machine Data Recovery
There are plenty of others to try from too.
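If you'd rather stay on Linux, the attached clone also satisfies extundelete's "partition must be unmounted" requirement, since the recovery VM never mounts it. A hedged sketch (device and path are examples; check lsblk for the real device name):
sudo apt-get install extundelete
lsblk                                              # find the attached clone, e.g. /dev/sdb1
sudo extundelete /dev/sdb1 --restore-directory home/youruser/important-folder
# recovered files are written to ./RECOVERED_FILES/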
Another approach would be to create an image from the original disk and export it as a VMDK image (saving it to a storage bucket). Then download it to your local computer and use, for example, VMware VMDK Recovery or other specialized software for extracting data from virtual machine disk images.
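Roughly, that export looks like this with gcloud (image, disk, bucket and zone names are placeholders):
gcloud compute images create rescue-image --source-disk=original-disk --source-disk-zone=us-central1-a
gcloud compute images export --image=rescue-image --destination-uri=gs://my-bucket/rescue-image.vmdk --export-format=vmdk
gsutil cp gs://my-bucket/rescue-image.vmdk .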
I'm currently trying out Azure AKS, and during setup I obviously also want to make backups. For this, the best practice seems to be Velero. According to the Velero documentation, to include PV snapshots you annotate the pod/deployment. Example:
backup.velero.io/backup-volumes: wp-pv
Note that the above is for a statically provisioned managed disk. I can see that the snapshot is created. However, when I do a restore, a new PV is created instead of the one from the backup being used. Is this expected behavior?
Ideally, I would like to use dynamic PVs instead, but that makes it even less straightforward because I don't know what name the PV will get and thus can't add the proper annotations beforehand.
How can I solve this in a clean way? My ideal situation would be scheduled backups with Velero, and in case of a recovery have it automatically use the snapshot as the base for the PV instead of creating a new one that doesn't contain my data. For now, this seems to be a manual procedure? Am I missing something?
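For reference, the Velero commands I'm running look roughly like this (backup, schedule and namespace names are just examples from my test setup):
velero schedule create wp-daily --schedule="0 2 * * *" --include-namespaces wordpress
velero backup create wp-manual --include-namespaces wordpress
velero restore create --from-backup wp-manual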
This is by design.
PersistentVolumes, by definition, can only ever belong to one PVC claimant, even when provisioned dynamically.
I think what you want is to have the reclaim policy set to retain. See here:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#retain
A state of "Retain" should mean that the PVs data persists, it is just needing to be reclaimed by a new PV/PVC. The AKS should pick up on this... But I've only ever done this with AWS/Baremetal
In this case Velero, rightly, has to both recreate the PVC and PV for the volume to be released and reassigned to the new claimant.
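A minimal sketch of setting that on an existing PV (the PV name here is a placeholder; dynamically provisioned ones usually look like pvc-<uuid>):
kubectl get pv                                     # find the PV backing your claim
kubectl patch pv pvc-1234-abcd -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
# or set reclaimPolicy: Retain in the StorageClass so new PVs get it by default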
I'm currently trying to log in to one of my instances on Google Cloud, but I'm unable to do so. Somehow the machine escaped my attention and the hard disk filled up completely. Of course I want to free some disk space and make sure the server can restart, but I'm facing some issues.
First off, I found the guide on increasing the size of the persistent disk (https://cloud.google.com/compute/docs/disks/add-persistent-disk). I followed that and already set the disk to 50 GB, which should be fine for now.
However, at the file-system level, because the disk is full I cannot make any SSH connection. The error is simply a timeout, caused by the fact that there is absolutely no space for the SSH daemon to write to its log. Without any form of connection I cannot free disk space and/or run the "resize2fs" command.
Furthermore, I have already tried different approaches:
I do not seem to be able to change the boot disk to something else.
I created a snapshot and tried to increase the disk size on a new instance created from that snapshot, but it has the same problem (the filesystem is stuck at 15 GB).
I am not allowed to mount the disk as an additional disk in another instance.
Currently I'm pretty much out of ideas. The important data on the disk was backed up, but I'd rather have the settings working as well. Does anyone have any clues as to where to start?
[EDIT]
Currently still trying new things. I have also tried running shutdown and startup scripts that remove /opt/* in order to free some temporary space, but the scripts either don't run or produce errors I cannot catch. Working nearly blind is pretty frustrating, I must say.
The next step for me is to try to get the snapshot locally. It should be doable using a bucket; I will let you know.
[EDIT2]
Getting a snapshot locally is not an option either, or so it seems. Images from Google Cloud instances can only be created or deleted, not downloaded.
I'm now out of ideas.
So I finally found the answer. These steps were taken:
In the GUI I increased the size of the disk to 50 GB.
In the GUI I detached the drive by deleting the machine, while ensuring that I did not throw away the original disk.
In the GUI I created a new machine with a sufficiently big hard disk.
On the command line (important!) I attached the disk to the newly created machine (the GUI option still has a bug ...)
After that I could mount the disk as a secondary disk and perform all the operations I needed.
Keep in mind: by default, Google Cloud images do NOT use logical volume management, so pvresize/lvresize/etc. are not installed, and resize2fs might not work out of the box.
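For anyone hitting the same wall, the command-line part looked roughly like this (instance, disk, zone and device names are placeholders for my setup):
gcloud compute instances attach-disk rescue-vm --disk=original-disk --zone=europe-west1-b
# inside rescue-vm (the old disk typically shows up as /dev/sdb):
sudo mkdir -p /mnt/olddisk
sudo mount /dev/sdb1 /mnt/olddisk
sudo rm -rf /mnt/olddisk/opt/*          # free enough space for sshd to write its logs again
sudo umount /mnt/olddisk
sudo growpart /dev/sdb 1                # grow the partition to the new 50 GB (needs cloud-guest-utils)
sudo resize2fs /dev/sdb1                # then grow the ext filesystem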
This should be easy but ...
I have been working on a Google Compute Engine persistent disk image that I'm calling utilserver, and I basically now need to build it again from scratch, but I might need the original to try a few things in case problems come up. So I'd like to rename utilserver to utilserver-backup and then create a new utilserver that will hopefully end up being more correct. However, under the web console for my project there's only a "Delete" button, no "Rename" button. Neither does gcutil seem to have a rename command. OK, I tried creating a snapshot of utilserver and then, from that, a new persistent disk called utilserver-backup, but when I did that the new disk looked like a completely new image: none of my prior installation work was on there. Any ideas?
You can create a snapshot of your disk and then create multiple disks from that snapshot. By creating the snapshot you will have a backup of your original disk. You can then delete the original disk and create a new one with the same name. See the following link for more details about snapshots: https://cloud.google.com/compute/docs/disks/create-snapshots
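A rough gcloud sequence for that (the snapshot name and zone are examples, and the disk must be detached from any instance before you can delete it):
gcloud compute disks snapshot utilserver --snapshot-names=utilserver-backup-snap --zone=us-central1-a
gcloud compute disks create utilserver-backup --source-snapshot=utilserver-backup-snap --zone=us-central1-a
gcloud compute disks delete utilserver --zone=us-central1-a
# then recreate "utilserver" from scratch, e.g. from a base image:
gcloud compute disks create utilserver --image-family=debian-11 --image-project=debian-cloud --zone=us-central1-a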
I personally have tried creating a new disk from a snapshot using the following command, and it created a new disk with all my data:
gcutil adddisk <disk-name> --project=<project-id> --source_snapshot=<snapshot-name>
gcutil has been deprecated in favor of gcloud compute.
gcloud compute disks create <new-disk-name> --source-snapshot <snapshot-name> --zone=<zone-name>
Example:
gcloud compute disks create production --source-snapshot production-backup-2023-01-23 --zone=asia-southeast1-b
I know that we can use VM Depot to get started with Neo4j in Azure, but one thing that is not clear is where we should physically store the DB files. I looked around on the net for recommendations on where the physical files should be stored so that when a VM crashes or restarts, the data is not lost.
Can someone share their thoughts or point me to a resource with more details on the dos and don'ts of Neo4j on Azure for a production environment?
Regards
Kiran
When you set up a Neo4j VM via VM Depot, that image, by default, configures the database files to reside within the same VM as the server itself. The location is specified in neo4j-server.properties. This lets you simply spin up the VM and start using Neo4j immediately.
However: you'll soon discover that your storage space is limited (I believe the VM instances are set up with a 127 GB disk). To work with larger databases, you'll need to attach an additional disk (or disks), each disk up to 1 TB in size. These disks, as well as the main VM disk, are backed by blob storage, meaning they're durable, persistent disks.
How you ultimately configure this is up to you, depending on the size of the database and its purpose. The only storage to avoid, if you need persistence, is the scratch disk provided (which is a locally-attached drive with no durability).
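As a rough sketch of moving the database onto an attached data disk (assuming the VM Depot image keeps its config in /etc/neo4j/neo4j-server.properties, the init script is called neo4j-service, and the already-formatted data disk shows up as /dev/sdc1; verify all three on your VM):
sudo mkdir -p /datadrive
sudo mount /dev/sdc1 /datadrive
sudo service neo4j-service stop
sudo rsync -a /var/lib/neo4j/data/ /datadrive/neo4j-data/
# then point the server at the new location in neo4j-server.properties:
# org.neo4j.server.database.location=/datadrive/neo4j-data/graph.db
sudo service neo4j-service start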
The documentation announcing that VM doesn't say. But when you install Neo4j as a package onto other, similar Linux systems (the VM in question is a Linux VM), the data usually goes into /var/lib/neo4j/data. Here's an example:
user@host:/var/lib/neo4j/data$ pwd
/var/lib/neo4j/data
user@host:/var/lib/neo4j/data$ ls
graph.db keystore log neo4j-service.pid README.txt rrd
user@host:/var/lib/neo4j/data$ cat README.txt
Neo4j Data
=======================================
This directory contains all live data managed by this server, including
database files, logs, and other "live" files.
The main directory you really need is the "graph.db" directory; that's going to contain the bulk of the data. You may as well back up the entirety of this directory. Some of the files (like the .pid file and README.txt) of course aren't needed.
Now, there's no guarantee that in the VM it's going to be /var/lib/neo4j/data, but it will be something very similar. What you're looking for is a directory whose name ends in .db, since that's the default for new Neo4j databases.
To narrow it down further, once you get that VM running, just run updatedb and then locate *.db | grep neo4j, and that's almost certain to find it quickly.
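Once you've found it, a simple hedged example of backing it up (paths assume the default /var/lib/neo4j/data and the neo4j-service init script; stop the server first so the copy is consistent):
sudo service neo4j-service stop
sudo tar -czf /tmp/graph-db-backup.tar.gz -C /var/lib/neo4j/data graph.db
sudo service neo4j-service start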