rm not freeing disk space [closed] - linux

I've rm'ed a 2.5 GB log file, but it doesn't seem to have freed any space.
I did:
rm /opt/tomcat/logs/catalina.out
then this:
df -hT
and df reported my /opt mount still at 100% used.
Any suggestions?

Restart Tomcat. If the file is in use when you remove it, the space only becomes available once the process holding it open closes it (or exits).
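For example (a hedged sketch; the service name "tomcat" is an assumption, since it varies by distribution and install method, and older init systems would use service tomcat restart instead):
sudo systemctl restart tomcat
df -hT
The usage should drop back once the old Tomcat process has exited and released its handle on the deleted file.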

As others suggested, the file is probably still open in another process. To find out which one, you can run
lsof /opt/tomcat/logs/catalina.out
which lists the processes holding it open. You will probably find Tomcat in that list.
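If you are not sure which file is responsible, lsof can also list every open file whose on-disk name has already been removed; a sketch (the +aL1 form restricted to one filesystem follows the lsof man page example and assumes /opt is a mount point, as the df output suggests):
sudo lsof +L1           # all open files that have been unlinked (link count below 1)
sudo lsof +aL1 /opt     # the same, restricted to the filesystem mounted at /opt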

Your Problem:
It's possible that a running program is still holding on to the file.
Your Solution:
Per the other answers here, you can simply shut down Tomcat to stop it from holding on to the file.
If that is not an option, or if you simply want more details, check out this question: Find and remove large files that are open but have been deleted - it suggests some harsher ways to deal with it that may be more useful to your situation.
More Details:
The Linux/Unix filesystem treats an open file handle as just another "name" for the file. rm removes the name seen in the directory tree, but until all handles are closed the file still has other "names", and so it still exists. The filesystem doesn't reap a file until it is completely unnamed.
It might seem a little odd, but doing it this way allows for useful things like hard links, which are essentially alternate names for the same file.
This is why it is important to always call your language's equivalent of close() on a file handle once you are done with it: it tells the OS that the file is no longer being used. Sometimes that can't be helped, which is likely the case with Tomcat here. Refer to Bill Karwin's answer to read why.
Depending on the filesystem, this is usually implemented as a sort of reference count on the inode, so there may not be any real "names" involved. Things can also get confusing when stdin and stderr are redirected to a file or another byte stream (as is commonly done with services).
This whole idea is closely related to the concept of inodes, so if you are the curious type, I'd recommend reading up on those first.
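A minimal demonstration of the "deleted but still open" behaviour, as a sketch with a throwaway file (any Bourne-style shell):
echo "some data" > demo.txt
exec 3< demo.txt     # keep a file descriptor open on the file
rm demo.txt          # the directory entry (the "name") is gone...
cat <&3              # ...but the data is still readable through fd 3
exec 3<&-            # only closing the last descriptor lets the space be reclaimed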
Discussion
It doesn't work so well anymore, but you used to be able to update the entire OS, start up a new HTTP daemon using the new libraries, and finally shut down the old one once it was no longer serving any clients (releasing the old handles). HTTP clients wouldn't even miss a beat.
Basically, you can completely wipe out the kernel and all the libraries "from underneath" running programs. Since the "name" still exists for the old copies, the files still exist on disk/in memory for those particular programs; afterwards it is just a matter of restarting all the services. While this is an advanced usage scenario, it is one reason why some Unix systems have years of uptime on record.

Restarting Tomcat will release any hold Tomcat has on the file. However, to avoid restarting Tomcat (e.g. if this is a production environment and you don't want to bring the services down unnecessarily), you can usually just overwrite the file:
cp /dev/null /opt/tomcat/logs/catalina.out
Or even shorter and more direct:
> /opt/tomcat/logs/catalina.out
I use these methods all the time to clear log files for currently running server processes in the course of troubleshooting or freeing disk space. This leaves the inode alone but clears the actual file data, whereas trying to delete the file often either doesn't work or, at the very least, confuses the running process's log writer.
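If GNU coreutils' truncate is available (my addition, not something the original answer mentions), it does the same thing a little more explicitly:
truncate -s 0 /opt/tomcat/logs/catalina.out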

As FerranB and Paul Tomblin have noted on this thread, the file is in use and the disk space won't be freed until the file is closed.
The problem is that you can't signal the Catalina process to close catalina.out, because the file handle isn't under the control of the Java process. It was opened by shell I/O redirection in catalina.sh when you started Tomcat. Only by terminating the Catalina process can that file handle be closed.
There are two solutions to prevent this in the future:
Don't allow output from Tomcat apps to go into catalina.out. Instead use the swallowOutput property, and configure log channels for output. Logs managed by log4j can be rotated without restarting the Catalina process.
Modify catalina.sh to pipe output to cronolog instead of simply redirecting to catalina.out. That way cronolog will rotate logs for you.
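For illustration only, a rough sketch of the cronolog variant; the exact startup line differs between Tomcat versions, and the cronolog path and log-name pattern here are assumptions:
# catalina.sh normally redirects output, roughly like this:
#   ... org.apache.catalina.startup.Bootstrap "$@" start >> "$CATALINA_OUT" 2>&1 &
# the cronolog variant pipes instead of redirecting, so logs rotate daily without restarting Tomcat:
#   ... org.apache.catalina.startup.Bootstrap "$@" start 2>&1 \
#       | /usr/sbin/cronolog "$CATALINA_BASE/logs/catalina.%Y-%m-%d.out" &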

The best solution is to use 'echo' (as ejoncas suggested):
$ echo '' > huge_file.log
This operation is quite safe and fast (it clears roughly 1 GB of data per second in my experience), which matters especially when you are operating on a production server.
Don't simply remove the file with 'rm': you would first have to stop the process writing to it, otherwise the disk space won't be freed.
refer to: http://siwei.me/blog/posts/how-to-deal-with-huge-log-file-in-production
UPDATE: the origin of my story
In 2013, when I was working for youku.com, I found one Saturday that a core server was down. The reason: the disk was full (of log files).
So I simply ran rm log_file.log (without stopping the web app process), but found that (1) no disk space was freed, and (2) the log file was no longer visible to me.
So I had to restart my web server (a Rails app), and the disk space was finally freed.
This was an important lesson for me. It taught me that echo '' > log_file.log is the correct way to free disk space if you don't want to stop the running process that is writing logs to the file.

If something still has the file open, it won't actually go away. You probably need to signal Catalina somehow to close and reopen its log files.

If there is a second hard link to the file, then it won't be deleted until that link is removed as well (and no process still has it open).
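A quick way to see this is the link count reported by stat or ls -li; a small sketch with scratch files:
touch a
ln a b           # create a second hard link to the same inode
stat -c %h a     # prints 2: the inode now has two names
rm a             # removes one name; the data is still reachable as "b"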

Run the following command to check which deleted files are still occupying disk space:
$ sudo lsof | grep deleted
It will show the deleted files that are still holding on to disk space.
Then kill (or restart) the offending process by PID or name:
$ sudo kill <pid>
$ df -h
Check again; the space should now have been freed.
If not, run the commands below to see which files are taking up space above a given threshold:
# cd /
# du --threshold=<SIZE>
Give it a size; du will list the files and directories occupying more than that threshold, and you can then delete the offending files.
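For example, a hedged one-liner to list everything larger than 1 GB on the root filesystem (GNU du and sort; --threshold needs a reasonably recent coreutils):
# du -xh --threshold=1G / 2>/dev/null | sort -h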

Is the rm journaled/scheduled? Try a 'sync' command to force the write.

Related

What happens when you delete shared memory files in /dev/shm using the 'rm' command

I used POSIX shared memory to communicate between two processes. Then, while the two processes were sharing data, I used the 'rm' command to remove all the shared-memory files mounted in /dev/shm. I expected some errors to happen, but everything still worked normally.
So I have a question:
What happens when I use the rm command to delete all the shared-memory files in the /dev/shm directory?
I have googled but cannot find anywhere that discusses this situation.
Can anyone please explain it to me?
Thanks so much.

How to remove data from main memory/RAM

Is there any way to remove specific data from main memory in Linux, so that it has to be fetched again from the hard disk?
Rather than removing a specific piece of data, this will drop everything from the Linux cache. (I am assuming this is what you mean: you want Linux to reload a file from the hard disk into main memory.)
sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches"
Link: How to clear memory cache on linux
Also, my apologies to Mohit M. as he answered this in the comment section before me.
You could always change your app to use the O_DIRECT flag, which bypasses the page cache and fetches the file from disk for a specific read call. It may not work in all cases (especially with stacked block devices and the like), in which case you should stick to drop_caches as others before me explained.
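For a quick experiment from the shell, GNU dd can exercise both ideas; a sketch (bigfile.dat is a hypothetical file, and iflag=nocache needs a reasonably recent coreutils):
dd if=bigfile.dat of=/dev/null bs=1M iflag=direct   # read via O_DIRECT, bypassing the page cache
dd if=bigfile.dat iflag=nocache count=0             # advise the kernel to drop this file's cached pages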

CentOS free space on disk not updating

I am new to Linux and working with a CentOS system.
Running df -H shows the disk 82% full, with only 15 GB free.
I wanted some extra space, so using WinSCP I shift-deleted a 15 GB file.
Then I executed df -H once again, but it still shows only 15 GB free. Where did the space from the deleted file go?
Please help me find a solution to this.
In most Unix filesystems, if a file is open, the OS will remove its name right away but will not release the space until the file is closed. Why? Because the file is still accessible to the process that opened it.
On the other hand, Windows used to complain that it couldn't delete a file because it was in use; it seems that in later incarnations Explorer will pretend to delete the file.
Some applications are famous for bad behavior related to this. For example, I have to deal with some versions of MySQL that do not properly close some files; over time I can find several GB of space wasted in /tmp.
You can use the lsof command to list open files (man lsof). If the problem is related to open files and you can afford a reboot, that is most likely the easiest way to fix it.

Is moving a file safer than deleting it if you want to remove all traces of it? [closed]

I recently accidentally ran "rm -rf *" on a directory and deleted some files that I needed. However, I was able to recover most of them using photorec. Apparently, "deleting" a file just removes references to it; it is not truly gone until it is overwritten by something else.
So if I wanted to remove the file completely, couldn't I just execute
mv myfile.txt /temp/myfile.txt
(or move to external storage)
You should consider using the Linux command shred, which overwrites the target file multiple times before deleting it completely, making it 'impossible' to recover the file.
You can read a bit about the shred command here.
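Illustrative usage, as a sketch (see the shred man page for the full set of options):
shred -u -n 3 myfile.txt    # overwrite the file 3 times, then unlink it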
Just moving the file does not cover you for good: if you move it to external storage, the local copy of the file is deleted just as it would be with the rm command.
No, that won't help either.
A move between file systems is really just a "copy + rm" internally: the original storage location of the file on the "source" medium is still there, just marked as available. A move WITHIN a file system doesn't touch the file's bytes at all; it just updates the bookkeeping to say "file X is now in location Y".
To truly wipe a file, you must overwrite all of its bytes. And yet again, technology gets in the way of that: if you're using a solid-state storage medium, there is a VERY high chance that writing 'garbage' over the file won't touch the actual cells the file is stored in, but will instead be written somewhere completely different.
For magnetic media, repeated overwriting with alternating 0x00, 0xFF, and random bytes will eventually totally nuke the file. For SSD/flash systems, the device either has to offer a "secure erase" option, or you have to smash the chips into dust. For optical media, it's even more complicated: -R media cannot be erased, only destroyed; for -RW, I don't know how many repeated write cycles are required to truly erase the bits.
No (and not just because moving it somewhere else on your computer is not removing it from the computer). The way to completely remove a file is to completely overwrite the space on the disk where it resided. The Linux command shred will accomplish this.
Basically, no: in most file systems you can't guarantee that a file is overwritten without going very low level. Removing a file and/or moving it only changes the pointers to the file, not the file's contents on disk in any way. Even the Linux command shred won't guarantee a file's removal on many file systems, since it assumes files are overwritten in place.
On SSDs, it's even more likely that your data stays around for a long time: even if the file system attempts to overwrite blocks, the SSD will remap the write to a new block (erasing takes a lot of time; if it wrote in place, things would be very slow).
In the end, with modern file systems and disks, your best chance of keeping files secure is to keep them encrypted to begin with. If they are stored anywhere in clear text, they can be very hard to remove, and recovering an encrypted file from disk (or from a backup, for that matter) won't be much use to anyone without the encryption key.

How do I measure net used disk space change due to activity by a given process in Linux?

I'd like to monitor disk space requirements of a running process. Ideally, I want to be able to point to a process and find out the net change in used disk space attributable to it. Is there an easy way of doing this in Linux? (I'm pretty sure it would be feasible, though maybe not very easy, to do this in Solaris with DTrace)
Probably you'll have to ptrace it (or get strace to do it for you and parse the output), and then try to work out what disc space is being used.
This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.
Quite a lot of things can affect disc usage in non-obvious ways, because most Linux filesystems support "holes" (sparse files). I suppose you could count holes as well for accounting purposes.
Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.
Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").
Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
Happy hacking :)
If you know the PID of the process to monitor, you'll find plenty of information about it in /proc/<PID>.
The file /proc/<PID>/io contains statistics about bytes read and written by the process; it should be what you are looking for.
Moreover, in /proc/<PID>/fd/ you'll find links to all the files opened by your process, so you could monitor them.
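A minimal sketch of using those files (1234 stands in for a real PID):
cat /proc/1234/io                                             # read_bytes / write_bytes count actual storage-level I/O
ls -l /proc/1234/fd                                           # the files the process currently has open
watch -n 5 'grep -E "read_bytes|write_bytes" /proc/1234/io'   # poll to watch the counters grow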
There is also DTrace for Linux available:
http://librenix.com/?inode=13584
