"Unstable" NFS mount point - linux

First of all, this is the first time I'm posting a question on StackOverflow, so please don't kill me if I've done anything wrong.
Here is my issue:
We have a few dedicated servers with a well-known French provider. For one of those servers we recently acquired 5,000 GB of backup space that can be mounted via NFS, and that's what we've done.
The issue comes when backing up big files. Every night we back up several VMs running on that host, and we know for a fact that the backups are not being done properly (the file size differs a lot from one day to the next, and when we checked the content of a backup there was stuff missing).
So it seems the mount point is not stable. There appear to be brief network cuts, after which the hypervisor ends the current backup and starts the next one.
This is how it's mounted right now:
xxx.xxx.xxx:/export/ftpbackup/xxx.ip-11-22-33.eu/ /NFS nfs auto,timeo=5,retrans=5,actimeo=10,retry=5,bg,soft,intr,nolock,rw,_netdev,mountproto=tcp 0 0
Any advice? Is there any parameter you would change?
We need to be sure that the NFS mount point is correctly working in order to have proper backups.
Thank you so much

By specifying "soft" as an option, you're saying that it's OK for the mount to be unreliable: the kernel may return an I/O error instead of running the I/O to completion when things are taking too long. Using a hard mount (that is, omitting the "soft" option) instructs the kernel to keep retrying instead of returning I/O errors on timeouts.
This will fix your corrupted backups, but... your backup process will hang hard until the I/Os complete. An alternative is to use much longer timeout values: timeo is measured in tenths of a second, so timeo=5 means the client initially waits only half a second for a reply before retrying.
You're using TCP for the mount protocol, but not for NFS itself. If your server supports it, consider adding "tcp" to the options line.
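As a rough illustration of the changes suggested above, this Python sketch rewrites the option string from the question into a hard mount that also uses TCP for NFS itself. The helper name harden_nfs_options and the timeo=600 value (60 seconds) are assumptions chosen for illustration, not values given in the answer; adjust them to whatever your provider's server tolerates.

```python
def harden_nfs_options(options, timeo=600):
    """Return an fstab option string for a hard NFS mount over TCP.

    Hypothetical helper: drops soft/udp and any existing timeo, then
    appends hard, a longer timeo (in tenths of a second) and tcp.
    """
    replaced = {"soft", "hard", "timeo", "tcp", "udp", "proto"}
    kept = []
    for opt in options.split(","):
        key = opt.split("=", 1)[0]
        if key not in replaced:  # "mountproto" is a different key and is kept
            kept.append(opt)
    kept += ["hard", f"timeo={timeo}", "tcp"]
    return ",".join(kept)


current = ("auto,timeo=5,retrans=5,actimeo=10,retry=5,bg,soft,intr,"
           "nolock,rw,_netdev,mountproto=tcp")
print(harden_nfs_options(current))
```

The printed string can be pasted into the options column of the fstab entry. Keep in mind that with a hard mount the backup job blocks during an outage instead of failing, which is exactly the trade-off described above.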

Related

Proxmox VE: How to create a raw disk and pass it through to a VM

I am searching for an answer on how to create and pass through a raw device to a VM using proxmox. Through that I am hoping to have full control of the disk including S.M.A.R.T. stats and disk spindown.
Currently I am using passthrough using the SATA passthrough offered by proxmox.
Unfortunately I have no clue how to create a raw disk file from my (empty) disk. Furthermore, I am not entirely certain how to bind it to the VM.
I hope someone knows the relevant steps.
Side notes:
This question is just a measure I want to try out to achieve a certain goal. For the sake of simplicity I posed my question confined to the part above. However, if you have a better idea, feel free to give me a hint. So far I have tried a lot of things to achieve my ultimate goal.
Goal that I want to achieve:
I am using Proxmox VE 5.3-8 on a HP Proliant Gen 8 server. It hosts several VMs among which OMV should serve as a NAS. Since the files will not be accessed too often, I opt for a spindown of the drives.
My goal is reduction of noise and power savings.
Current status:
I passed through two disks by adding them to
/etc/pve/nodes/pve/qemu-server/vmid.conf
sata1: /dev/disk/by-id/{disk-id}
Through that I do see SMART stats and everything except disk spindown works fine. Using virtio instead of SATA does not give me SMART values.
Using hdparm -y to put a drive to sleep does not work inside the VM. Doing the same on the Proxmox console results in a sleep, but the drive wakes up a few seconds later.
Passing through the entire HBA is currently not an option.
I read in a forum that first installing Debian and then manually installing the proxmox packages resulted in a success. However that was still for Debian jessie and three years ago.
Install Proxmox VE on Debian Stretch
Before I try this as a last resort, I want to find out whether passing the disk through as a raw file will lead to the desired result.
Maybe someone has an idea on how to achieve my ultimate goal.
I do not have a clear answer to your question about "passing through" the disk, but I recently found a good enough solution for my use case.
I have an HDD that I planned to use as a backup directory for VMs, but I also wanted to put any kind of data on it and share that disk with any VM that needs it.
The solution I found is to format the disk with ZFS, then create mount points for different uses (vzdump backups, a shared NAS folder across VMs, an ISO mount point, etc.). I followed this guide: https://forum.level1techs.com/t/how-to-create-a-nas-using-zfs-and-proxmox-with-pictures/117375
I ended up installing Samba on the Proxmox host itself, configured to share some folders/mount points of the disk via SMB. Now the device appears as a normal disk over the network, with excellent read/write speed as everything is local.
Sorry that this post does not "answer" your question (no SMART data or things low level like that :'( ) BUT shared storage ^^'

Unable to increase disk size on file system

I'm currently trying to log in to one of the instances created on google cloud, but found myself unable to do so. Somehow the machine escaped my attention and the hard disk got completely full. Of course I wanted to free some disk space and make sure the server running could restart, but I am facing some issues.
First off, I have found the guide on increasing the size of the persistent disk (https://cloud.google.com/compute/docs/disks/add-persistent-disk). I followed that and already set it to 50 GB, which should be fine for now.
However, at the file system level the disk is still full, so I cannot make any SSH connection. The error is simply a timeout caused by the fact that there is absolutely no space for the SSH daemon to write to its log. Without any form of connection I cannot free some disk space and/or run the "resize2fs" command.
Furthermore, I already tried different approaches.
I seem to not be able to change the boot disk to something else.
I created a snapshot and tried to increase the disk size on the new instance I created from that snapshot, but it has the same problem (filesystem is stuck at 15GB).
I am not allowed to mount the disk as an additional disk in another instance.
Currently I'm pretty much out of ideas. The important data on the disk was backed up, but I'd rather have the settings working as well. Does anyone have any clues as to where to start?
[EDIT]
Currently still trying out new things. I have also tried to run shutdown and startup scripts that remove /opt/* in order to free some temporary space, but the scripts either don't run or produce some error I cannot catch. It's pretty frustrating working nearly blind, I must say.
The next step for me would be to try and get the snapshot locally. It should be doable using the bucket but I will let you know.
[EDIT2]
Getting a snapshot locally is not an option either or so it seems. Images from the google cloud instances can only be created or deleted, but not downloaded.
I'm now out of ideas.
So I finally found the answer. These steps were taken:
In the GUI I increased the size of the disk to 50 GB.
In the GUI I detached the drive by deleting the machine whilst ensuring that I did not throw away the original disk.
In the GUI I created a new machine with a sufficiently big hard disk.
On the command line (important!!) I attached the disk to the newly created machine (the GUI option has a bug still ...)
After that I could mount the disk as a secondary disk and perform all the operations I needed.
Keep in mind: By default google cloud solutions do NOT use logical volume management, so pvresize/lvresize/etc. is not installed and resize2fs might not work out of the box.
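The command-line part of the steps above might look roughly like the commands this Python sketch prints. It only builds and prints the commands, it does not run them. The instance, disk and zone names, the /dev/sdb device and partition number 1 are all placeholders I made up, and the growpart/e2fsck/resize2fs sequence assumes an ext filesystem on a partitioned disk, which the answer does not state. The first command is meant for your workstation, the rest for the rescue machine with the disk attached but not mounted.

```python
import shlex


def rescue_commands(instance, disk, zone, device="/dev/sdb", partition=1):
    """Build (but do not run) commands to attach and grow the full disk.

    All argument values are placeholders for illustration only.
    """
    part_dev = f"{device}{partition}"
    return [
        # Run where gcloud is configured: attach the old disk as a secondary disk.
        ["gcloud", "compute", "instances", "attach-disk", instance,
         f"--disk={disk}", f"--zone={zone}"],
        # Run on the rescue machine: grow the partition to fill the resized disk.
        ["sudo", "growpart", device, str(partition)],
        # resize2fs asks for a forced check before resizing an unmounted filesystem.
        ["sudo", "e2fsck", "-f", part_dev],
        ["sudo", "resize2fs", part_dev],
    ]


for cmd in rescue_commands("rescue-vm", "full-disk", "europe-west1-b"):
    print(shlex.join(cmd))
```

Check the actual device name with lsblk on the rescue machine before running anything, since the attached disk is not guaranteed to show up as /dev/sdb.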

hung_task_timeout_secs error during copy to a mount point in linux

I am trying to copy data files from my VM to an NFS VM with ZFS storage (both VMs can talk to each other). During the copy I sometimes encounter this error:
INFO: task cp: blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message"
Both my VMs hang and I have to restart them. If I copy again it works.
I have around 233 data files to copy and it's becoming difficult to restart the VMs again and again.
I looked at the solutions given on the internet and changed vm.dirty_ratio to 5 and vm.dirty_background_ratio to 10 to resolve this, but it did not work.
I am running these VMs on VirtualBox and allocated around 17GB RAM to one and around 6GB RAM to the NFS VM.
Any hack which could help me in copying these files to the NFS without my VM's hanging?
I am sorry if I am answering a question with more questions, but this case has many variables that need exploring.
1. You have a Linux VM sharing your storage (assumption)
A. Which distro? 32 or 64 bits? When the problem happens, what does top report for system load?
B. Local storage, NAS or SAN?
C. Which version of NFS ? 3 or 4 ?
D. Can you set the variables of your mount when mapping the NFS share? You might want to play with rsize and wsize, setting them to at least 64000. I would recommend also setting noatime and nodiratime on the share.
E. From my VMware background with Gluster, there are some timeout/refresh settings you can set on the storage side. How often the storage publishes its presence, telling it is alive. A good start is 20 seconds.
F. VMware can tell you how much latency you have for read or write on a physical and on a VM level. Try to figure out those to know who to blame.
Ah, and, of course, make sure your Linux VM has the latest patches applied.
Let's see where we get from here.
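Before tuning rsize and wsize as suggested in point D, it can help to see which values the client actually negotiated. The sketch below parses text in the /proc/mounts format and reports rsize/wsize for each NFS mount. The function name and the sample line (server name, export path and sizes) are made up for illustration; on a real client you would pass in the contents of /proc/mounts instead.

```python
def nfs_transfer_sizes(mounts_text):
    """Map each NFS mount point to its rsize/wsize (None if not listed)."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        sizes[fields[1]] = {
            name: int(opts[name]) if opts.get(name) else None
            for name in ("rsize", "wsize")
        }
    return sizes


# Hypothetical sample; on a real system use open("/proc/mounts").read().
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "nfs-vm:/tank/data /mnt/nfs nfs rw,noatime,vers=3,rsize=65536,wsize=65536,hard 0 0\n"
)
print(nfs_transfer_sizes(sample))
```

If the reported sizes are already at or above what you intended to set, the hang is more likely to come from the storage side or from memory pressure than from the transfer size.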

mount a drive when rsync connects to server

Background first:
I am working ~120km from home and therefore live in an apartment during the week.
I want to sync my NAS at home to a large (and cheap) HDD I use in the apartment.
Trouble is: The NAS is a WHS-NAS that's getting quite slow and needs to be replaced sometime soon.
Good news is: There is a Linux-Server (openSUSE) running 24/7.
So my idea was:
Mount all the relevant NAS shares on the Linux server and then sync from there.
That works fine except for the power management.
If I don't use the NAS, the power management kicks in and sends it to standby.
If that happens while the shares are mounted, the next rsync will believe that the folders are empty (because the mount points still exist but have no data).
Currently I log in via SSH and make sure everything is mounted before syncing, but that is just quick and dirty.
I could change the power management to 24/7, but that would be quick and expensive and dirty.
I am here searching for a clean solution. My idea was, as the title suggests, that the Linux server should recognize an rsync login and react to it by waking the NAS and mounting the shares.
I have some scripts that would do the job, but I can't find a place to put them so that they would be called on an rsync login.
Flow of my idea would be something like
Client.RSYNC.Connect --> Server.RSYNC.Receive --> NAS.Wake --> Server.NAS.Mount --> Server.RSYNC.Connected/Disconnect(if NAS unavailable)
Is something like that even possible, or does someone have a good solution for the problem (other than a 24/7 NAS and manual work)?
Kind regards
Ingo
If your server is running openSUSE newer than 12.2 with systemd, you can create a systemd socket unit for an rsync server whose service has a pre-exec step (for example an ExecStartPre= command) that wakes the NAS and mounts the shares.
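Whichever way the mount is triggered, the "rsync sees empty folders" problem from the question can also be guarded against in the script that starts the sync. The Python sketch below is a different, complementary technique to the systemd socket idea: it refuses to run rsync unless every share is a real mount point that lists at least one entry. The function names and the example paths in the comment are hypothetical.

```python
import os
import subprocess


def share_is_ready(path):
    """True only if path is a mount point that currently lists an entry."""
    if not os.path.ismount(path):
        return False  # share not mounted: only the bare mount directory exists
    try:
        with os.scandir(path) as entries:
            return next(entries, None) is not None
    except OSError:
        return False  # stale or unreachable mount


def sync_if_ready(shares, rsync_args, run=subprocess.call):
    """Run rsync only when all shares are ready; return True on success."""
    missing = [p for p in shares if not share_is_ready(p)]
    if missing:
        print("Skipping sync, shares not ready:", ", ".join(missing))
        return False
    return run(["rsync", *rsync_args]) == 0


# Example call (hypothetical paths):
# sync_if_ready(["/mnt/nas/photos"], ["-a", "/mnt/nas/photos/", "/srv/sync/photos/"])
```

A wake-and-mount script could be called before this check, so that the sync is skipped only when the NAS really cannot be reached.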

using torrents to back up vhd's

Hi, this may be a redundant question, but I have a hunch there is a tool for this, or there should be, and if there isn't I might just make it. Or maybe I am barking up the wrong tree, in which case please correct my thinking.
My problem is this: I am looking for some way to migrate large virtual disk drives off a server once a week via an internet connection of only moderate speed, in a solution that can be throttled for bandwidth because the internet connection is always in use.
I thought about it and the problem is familiar: large files that have to be moved, that can be throttled, and that can easily survive disconnection and reconnection. The only solution I am familiar with that does this perfectly is torrents.
Is there a way to automatically make torrents and automatically "send" them to a client's download list remotely? I am working on a Windows Hyper-V host, but I use only Linux for the guests and I could easily cook up a guest to do the copying, so consider it a Windows or Linux problem.
PS: the VHDs are "offline" copies of guest servers by the time I am moving them, so consider them merely 20-30 GB dumb files.
PPS: I'd rather avoid spending money
Bittorrent is an excellent choice, as it handles both incremental updates and automatic resume after connection loss very well.
To create a .torrent file automatically, use the btmakemetainfo script found in the original bittorrent package, or one from the numerous rewrites (bittornado, ...) -- all that matters is that it's scriptable. You should take care to set the "disable DHT" flag in the .torrent file.
You will need to find a tracker that allows you to track files with arbitrary hashes (because you do not know these in advance); you can either use an existing open tracker, or set up your own, but you should take care to limit the client IP ranges appropriately.
This reduces the problem to transferring the .torrent files -- I usually use rsync via ssh from a cronjob for that.
For point to point transfers, torrent is an expensive use of bandwidth. For 1:n transfers it is great as the distribution of load allows the client's upload bandwidth to be shared by other clients, so the bandwidth cost is amortised and everyone gains...
It sounds like you have only one client in which case I would look at a different solution...
wget allows for throttling and can resume transfers where it left off if the FTP/HTTP server supports resuming transfers. That is what I would use.
You can use rsync for that (http://linux.die.net/man/1/rsync). Search for the --partial option in man and that should do the trick. When a transfer is interrupted the unfinished result (file or directory) is kept. I am not 100% sure if it works with telnet/ssh transport when you send from local to a remote location (never checked that) but it should work with rsync daemon on the remote side.
You can also use that for sync in two local storage locations.
rsync --partial [-r for directories] source destination
Edit: I have since confirmed that it also works over ssh, so the doubt expressed above no longer applies.
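To combine --partial with the throttling the question asks for, a small wrapper can rerun rsync until it succeeds. This is only a sketch: the function names, the retry count and the delay are assumptions of mine, and --bwlimit is rsync's own rate limit option (by default in units of 1024 bytes per second).

```python
import subprocess
import time


def rsync_command(source, destination, bwlimit_kbps):
    """Build an rsync call that keeps partial files and limits bandwidth."""
    return ["rsync", "--partial", f"--bwlimit={bwlimit_kbps}", source, destination]


def transfer_with_retries(cmd, attempts=5, delay=60,
                          run=subprocess.call, sleep=time.sleep):
    """Rerun cmd until it exits with 0; return the attempt number or None."""
    for attempt in range(1, attempts + 1):
        if run(cmd) == 0:
            return attempt
        if attempt < attempts:
            sleep(delay)  # wait before resuming the interrupted transfer
    return None


# Example call (hypothetical file and host names):
# transfer_with_retries(rsync_command("guest1.vhd", "backup@remote:/vhd/", 500))
```

Because --partial keeps the unfinished file on the receiving side, each retry can build on what has already been transferred instead of starting from scratch.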
