Bash - delete files by date/filename - linux

I have a bash script which creates a mysqldump backup every hour in a certain directory.
The filenames of the backup files include the date and hour as per the following schema:
backupfile_<day>-<month>-<year>_<hour>.sql.gz
and to clarify here are some example filenames:
backupfile_30-05-2012_0800.sql.gz
backupfile_01-06-2012_0100.sql.gz
backupfile_05-06-2012_1500.sql.gz
Would someone help me with creating a script that will loop through all files in the directory and then delete files LEAVING the following:
Keep alternate hour backups older than a day
Keep twice daily backups older than a week
Keep once daily backups older than a month.
I have the following beginnings of the script:
#!/bin/bash
cd /backup_dir
for file in *
do
# do the magic to find out if this file's time is up (i.e. it needs to be deleted)
# delete the file
done

I have seen many fancy scripts like this for taking scheduled backups, and I wonder why folks don't make use of the logrotate utility, which is available on most *nix distros today and supports the following options of interest (a config sketch follows the list):
compress
Old versions of log files are compressed with gzip by default.
dateext
Archive old versions of log files adding a daily extension like YYYYMMDD instead
of simply adding a number.
olddir directory
Logs are moved into directory for rotation. The directory must be on the same
physical device as the log file being rotated, and is assumed to be relative to
the directory holding the log file unless an absolute path name is specified.
When this option is used all old versions of the log end up in directory. This
option may be overridden by the noolddir option.
notifempty
Do not rotate the log if it is empty (this overrides the ifempty option).
postrotate/endscript
The lines between postrotate and endscript (both of which must appear on lines by
themselves) are executed after the log file is rotated. These directives may
only appear inside of a log file definition. See prerotate as well.
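As a sketch only (the paths, schedule, and retention counts are assumptions, and it presumes the dump is written to a fixed filename that logrotate then rotates and archives), a stanza built from the directives above might look like:

/backup_dir/backupfile.sql {
    daily
    rotate 30
    compress
    dateext
    # relative to /backup_dir; the directory must already exist
    olddir archive
    notifempty
    postrotate
        # hypothetical hook for pruning old archives
        /usr/local/bin/prune_backups.sh
    endscript
}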

You can parse your timestamps by iterating over filenames, or you can use the -cmin flag in the find command (see man 1 find for details).
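If you go the filename-parsing route, a minimal sketch could look like the following, assuming GNU date and implementing only the first rule (keep alternate-hour backups once a file is older than a day); the weekly and monthly rules would follow the same pattern with different thresholds:

#!/bin/bash
cd /backup_dir || exit 1

now=$(date +%s)

for file in backupfile_*.sql.gz; do
    # backupfile_<DD>-<MM>-<YYYY>_<HHMM>.sql.gz
    stamp=${file#backupfile_}
    stamp=${stamp%.sql.gz}               # e.g. 30-05-2012_0800
    datepart=${stamp%_*}                 # 30-05-2012
    hour=${stamp##*_}                    # 0800
    day=${datepart%%-*}
    month=${datepart#*-}; month=${month%%-*}
    year=${datepart##*-}

    filetime=$(date -d "$year-$month-$day ${hour:0:2}:${hour:2:2}" +%s) || continue
    age_hours=$(( (now - filetime) / 3600 ))

    # Rule 1 only: once a backup is older than a day, keep alternate hours
    # (here: even hours) and delete the odd ones.
    if (( age_hours > 24 )) && (( 10#${hour:0:2} % 2 == 1 )); then
        rm -- "$file"
    fi
done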

Related

Empty log files daily using cron task

I want to empty (not delete) log files daily at a particular time. Something like
echo "" > /home/user/dir/log/*.log
but it returns
-bash: /home/user/dir/log/*.log: ambiguous redirect
is there any way to achieve this?
You can't redirect to more than one file, but you can tee to multiple files.
tee /home/user/dir/log/*.log </dev/null
The redirect from /dev/null also avoids writing an empty line to the beginning of each file, which was another bug in your attempt. (Perhaps specify nullglob to avoid creating a file with the name *.log if the wildcard doesn't match any existing files, though.)
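Putting the two suggestions together (tee plus nullglob), here is a small sketch using the path from the question, to be scheduled from cron at whatever time suits:

#!/bin/bash
# Truncate every matching log file; with nullglob set, the array stays
# empty (and nothing happens) if no *.log file exists yet.
shopt -s nullglob
logs=(/home/user/dir/log/*.log)
(( ${#logs[@]} )) && tee "${logs[@]}" </dev/null >/dev/null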
However, a much better solution is probably to use the logrotate utility, which is installed out of the box on every Debian (and thus also Ubuntu, Mint, etc.) installation. It runs nightly by default and can be configured by dropping a file into its configuration directory. It lets you compress the previous version of a log file instead of just overwriting it, and takes care to preserve ownership, permissions, etc.
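A drop-in sketch under /etc/logrotate.d/ could look like this (the path and retention values are assumptions; copytruncate empties the live file in place rather than moving it, which matches "empty, not delete"):

/home/user/dir/log/*.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    copytruncate
}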

How to move a file to cron.d in Linux?

My my_cron file works when it is created directly in /etc/cron.d/:
sudo nano /etc/cron.d/my_cron
# Add content:
* * * * * username /path/to/python /path/to/file 2>/path/to/log
But it doesn't work when I copy/move it to the directory:
sudo cp ./my_cron /etc/cron.d/my_cron
ls -l /etc/cron.d outputs the same permissions both times: -rw-r--r--. The files are owned by root.
The only reason I could imagine at the moment is that I have to refresh/activate something after copying, which happens automatically on creation.
Tested on Ubuntu and Raspbian.
Any idea? Thanks!
Older cron daemons used to examine /etc/cron.d for updated content only when they saw that the last-modified timestamp of that directory, or of the /etc/crontab file, had changed since the last time cron scanned it. Recent cron daemons also examine the timestamps of the individual files in /etc/cron.d but maybe you're dealing with an old one here.
If you have an old cron, then if you copied a brand new file into /etc/cron.d then the directory's timestamp should change and cron should notice the new file.
However, if your cp was merely overwriting an existing file then that would not change the directory timestamp and cron would not pick up the new file content.
Editing a file in-place in /etc/cron.d would not necessarily update the directory timestamp, but some editors (certainly vi, unless you've configured it otherwise) will create temporary working files and perhaps a backup file in the directory where the file being edited lives. The creation and deletion of those other files will cause the directory timestamp to be updated, and that will cause cron to put the edited file into effect. This could explain why editing behaves differently for you than cp'ing does.
To force a timestamp to be updated you could do something like sudo touch /etc/crontab or create and immediately remove a scratch file (or a directory) in /etc/cron.d after you've cp'ed or rm'ed a file in there. Obviously touch is easier. If you want to go the create+delete route then mktemp would be a good tool to use for that, in order to avoid clobbering someone else's legitimate file.
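As a sketch of that suggestion (the scratch-file variant uses mktemp so no real file gets clobbered):

sudo cp ./my_cron /etc/cron.d/my_cron
# simplest: bump a timestamp cron watches
sudo touch /etc/crontab
# ...or create and immediately remove a scratch file in /etc/cron.d:
tmp=$(sudo mktemp /etc/cron.d/.refresh-XXXXXX) && sudo rm -- "$tmp"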
If you were really paranoid, you'd wait at least a second between making file changes and then doing whatever you choose to do to force a timestamp update. That should avoid the situation where a cron rescan, your file updates, and your touch or scratch create+delete could all happen within the granularity of the timestamp.
If you want to see what your cron is actually doing, you can sudo strace -p <pid-of-cron>. Mostly it sleeps for a minute at a time, but you'll see it stat some files and directories (including /etc/crontab and /etc/cron.d) each time it wakes up. And of course if it decides that it needs to run a job, you'll see that activity too.

Logrotate every 30th second and store logfiles in date-named directories

I've got a CentOS installation with a busy webserver.
I need to acquire stats from the log files and keep the old ones, ordered by date.
Every 30th second, the current logfile should be closed and processed (analysing entries and storing them into a database). Since this generates a lot of logfiles, I want to group them into directories, named by date.
At the moment, I have two files: rotation.conf and rotatenow.sh. The shell file creates directories based on YmdHMS.
After that, I run the command "logrotate ./rotation.conf -v --force" to invoke the process, but how do I make the config file put the log into the newly generated directory? Can the whole thing be done inside the config file?
now="$(date)"
now="$(date +'%Y-%m-%d-%H:%M:%S')"
foldernavn="/var/www/html/stats/logs/nmdstats/closed/$now"
mkdir "$foldernavn"
logrotate ./nmdhosting.conf -v --force
At the moment, the config-file looks like this:
/var/www/html/stats/logs/nmdhosting/access_log {
ifempty
missingok
(I am stuck)
(do some post-processing - run a Perl-script)
}
Any ideas would be deeply appreciated.
Update: I tried a different approach, adding this to the httpd.conf:
TransferLog "|/usr/sbin/rotatelogs /var/www/html/stats/logs/nmdstats/closed/activity_log.%Y%m%d%H%M%S 30".
It works, but apparently, it can't run a pre/post processing script when using this method. This is essential in order to update the database. I could perhaps run a shell/Perl-script using a cronjob, but I don't trust that method. The search goes on...
update 2:
I've also tested cronolog, but the functionality my project requires hasn't been implemented yet; it's on the to-do list. Since the latest version is from 2002, I'm not going to wait around for that to happen :)
However, I was unaware of the inotify-tools, so I managed to set up a listener:
srcdir="/var/www/html/stats/logs/nmdstats/history/"
inotifywait -m -e create "$srcdir" |
while read -r filename eventlist eventfile
do
    echo "This logfile has just been closed: $eventfile"
done
I think I can handle it from here. Thank you, John
No need for cron: if you use the TransferLog httpd.conf option to create a new log file every 30 seconds, you can run a post-processing daemon which watches the output directory with inotifywait (or Python's pyinotify, etc.). See here: inotify and bash - this will let you get notified by the OS very soon after a new file is created.
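A sketch of that daemon side, watching the directory the rotatelogs line from the update writes into (the processing script name is hypothetical):

#!/bin/bash
watchdir=/var/www/html/stats/logs/nmdstats/closed/

# Each time rotatelogs finishes (closes) a 30-second log, hand it to the
# post-processing script.
inotifywait -m -e close_write --format '%f' "$watchdir" |
while read -r finished; do
    /usr/local/bin/process_log.pl "$watchdir/$finished" &
done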

Changing file modification time for tar archive members

I have a tar archive on an NTFS drive on a Windows machine which contains a folder with files residing on a drive on my Linux machine. I try to update the archive from a bash shell script on my Linux machine with the -u (--update) tar option, so that only new versions of archive members are appended to the archive. However, due to the "time skew" between file times on the two filesystems, tar appends ALL the files in the folder to the archive, even if the folder does not contain any new versions of files at all.
So the problem is: how to add to an archive on machine B only new version of files from a folder on machine A in conditions when there is time skew between machines?
Is there a way to solve this problem so that the mtimes of individual files in the archive are preserved or changed only insignificantly (e.g. adjusted 10 minutes ahead to negate the time skew)? This could probably be accomplished by calling tar individually to append each file, but is there a more optimal solution?
Maybe there is a way to change the mtime individually for each file as it is added to the archive? The --after-date option for appending only files modified after a certain date is apparently not quite a suitable filter for this task.
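One workaround to sketch (it does not change mtimes at all, but sidesteps the cross-machine comparison): keep a local reference timestamp and append only files newer than it. Paths are illustrative; GNU tar and find are assumed, and the reference file must be created once with touch before the first run.

#!/bin/bash
ref=/var/lib/backup/last-run        # reference timestamp from the previous run
archive=/mnt/ntfs/archive.tar       # the archive on the NTFS drive
srcdir=/home/user/data              # the folder being archived

# Append only files modified since the last successful run, then record
# the time of this run for the next invocation.
find "$srcdir" -type f -newer "$ref" -print0 |
    tar --null --files-from=- -rf "$archive"
touch "$ref"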

Nightly backup and merging backup

I have a nightly backup script that makes a backup of any files that have been modified on one server and then syncs them across to our backup server.
/var/backups/backup-2011-04-02/backuped/ (the backed-up files and folders)
The format above is the nightly incremental backup, which copies all the files and folders into a date-stamped folder and then another folder underneath.
I'm thinking of a script which would run after the backup script and merge all the files in /var/backups/backup-2011-04-02/backuped/ into /var/www/live/documents.
So in theory I need to merge a number of different folders from the backup into the live www directory on the backup server, only with the right date.
So what's the best way to go about this script?
You could run rsync on each backup directory to the destination, in order of creation:
$ for f in `ls -tr /var/backups`; do rsync -aL "/var/backups/$f" /var/www/live/documents/; done
Of course you can put this line in a nightly cron job. The only thing to look out for is that the line above will choke if the filenames in your backup directory have spaces in them, but it looks like they don't, so you should be ok.
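If you want something a bit more defensive, here is a sketch that relies on the date-stamped directory names for ordering (oldest first, so newer backups win) and avoids word-splitting on unusual names:

#!/bin/bash
backup_root=/var/backups
dest=/var/www/live/documents/

# backup-YYYY-MM-DD sorts lexicographically in chronological order,
# so a plain glob already processes oldest to newest.
for dir in "$backup_root"/backup-*/backuped/; do
    rsync -aL "$dir" "$dest"
done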
