Check gzip is ok within a cron job - cron

I've started to get some problems with corrupt gzip files. I'm not sure exactly when it happens, and the colleague who set up our storage has quit, so I'm not an expert in cron jobs, but this is how it looks today:
/var/spool/new_files/*.csv
{
daily
rotate 12
missingok
notifempty
delaycompress
compress
sharedscripts
postrotate
service capture_data restart >/dev/null 2>&1 || true
endscript
}
In principle, at midnight the script rotates all CSV files in /var/spool/new_files/, renames them (incrementing the suffix by 1), gzips the one that is then named "2", and moves it to our long-term storage.
I don't know whether the files are corrupt right after they have been gzipped or whether this happens during the "transfer" to the storage. If I run zcat file_name | tail I get an "invalid compressed data--length error". This error happens randomly 1-3 times per month.
So the first thing I want to do is:
1. Run gzip -k and keep the original.
2. Check whether the file is corrupt after it has been gzipped. If it is:
   Retry once.
   If this also fails, add an error to the logs.
   Stop the cron job.
3. If the gzip file is OK after creation, move it to long-term storage.
4. Test whether the stored file is OK. If not:
   Retry once.
   If this also fails, add an error to the logs.
   Stop the cron job.
5. Throw away the original file.
Does the logic that I suggest make sense? Any suggestions how to add it to the cron job?

The logic seems OK. For the integrity check, see "How to check if a Unix .tar.gz file is a valid file without uncompressing?"
You can put all the commands you need in a .sh script and then add that script to the crontab in:
/var/spool/cron/
If you want to run the .sh script as root, just add or modify the /var/spool/cron/root file. In a similar way you can add cron jobs run by other users.
The cron entry would be something like:
0 0 * * * sh <path to your sh>
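As a rough illustration of the steps in the question, here is a minimal sketch of such a script. The function names (log_error, compress_and_verify, archive_file) and the storage path in the usage comment are hypothetical, and it assumes a gzip that supports -k (1.6 or later). gzip -t checks the integrity of the compressed stream (CRC and length) without writing the uncompressed data anywhere.

```shell
#!/bin/sh
# Sketch only: function names and the storage path are hypothetical.

log_error() {
    # cron usually mails anything written to stderr; swap in logger(1) for syslog
    echo "$(date '+%Y-%m-%d %H:%M:%S') ERROR: $1" >&2
}

# Compress $1 (keeping the original) and verify the result; retry once.
compress_and_verify() {
    for attempt in 1 2; do
        # -k keeps the original, -f overwrites a bad .gz left by a failed attempt
        if gzip -kf "$1" && gzip -t "$1.gz"; then
            return 0
        fi
    done
    log_error "gzip check failed twice for $1"
    return 1
}

# Copy the verified archive into directory $2, verify the copy, retry once.
# The original and the local .gz are removed only after the copy checks out.
archive_file() {
    compress_and_verify "$1" || return 1
    stored="$2/$(basename "$1").gz"
    for attempt in 1 2; do
        if cp "$1.gz" "$stored" && gzip -t "$stored"; then
            rm -f "$1" "$1.gz"
            return 0
        fi
    done
    log_error "stored copy failed verification twice: $stored"
    return 1
}

# Example use from the cron script (both paths are assumptions):
# archive_file /var/spool/new_files/data.csv.2 /mnt/long_term || exit 1
```

A non-zero return lets the calling script stop instead of deleting anything. Since logrotate currently does the compression itself, using something like this would probably mean dropping compress from the stanza and handling the rotated file in a separate step.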

Related

How does logrotate actually work

I am trying to set up log rotation for the httpd server, so I wrote a config in /etc/logrotate.d/apache.conf:
/var/log/httpd/* {
daily
rotate 3
size 20K
compress
delaycompress
}
What I understood from this file:
/var/log/httpd/* = where all the logs are stored that you want to rotate
daily = when you want to rotate the logs
rotate = only 3 rotated logs should be kept
size = rotate when the log file size meets this condition
compress = make a zip file of the rotated logs
delaycompress = some kind of compression; I don't know much about it
So I hit the Apache server so that it generates a lot of logs.
After the logs are generated, where are my logs stored? Does rotation run only when the size condition matches, or otherwise?
Thanks for any guidance or help.
One more thing: when and how does logrotate run, and why do some people suggest using a cron job with logrotate?
Where are my logs stored?
Unless you specify the olddir directive, they are rotated within the same directory in which they exist.
Does it run only when the size condition matches, or otherwise?
If you specify the size directive, logs are only rotated if they are larger than that size:
size size
Log files are rotated only if they grow bigger then size bytes.
Files that do not meet the size requirement are ignored (https://linux.die.net/man/8/logrotate).
Why do some people suggest using a cron job with logrotate?
logrotate is just an executable; it does not handle any facet of how or when it is executed. cron is typically how logrotate's execution is scheduled. For example, on CentOS, inside the /etc/cron.daily directory is an executable shell script:
#!/bin/sh
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate.status /etc/logrotate.conf
EXITVALUE=$?
if [ $EXITVALUE != 0 ]; then
/usr/bin/logger -t logrotate "ALERT exited abnormally with [$EXITVALUE]"
fi
exit 0
This script is executed once per day by cron and is how logrotate's execution is actually initiated.
A couple of other problems:
/var/log/httpd/* - Unless you're rotating files out of the original directory with the olddir directive, never end your logrotate's directory definition with a wildcard (*). This definition is going to glom on to every file in that directory, including the files that you've already rotated. Logrotate has no way of keeping track of what files are actual logs and which are stored rotations. Your directory definition should be something like /var/log/httpd/*_log instead.
You should be reloading httpd after you rotate the log files, else it will probably continue to log into the rotated file because you never closed its handle.
sharedscripts
postrotate
/bin/systemctl reload httpd.service > /dev/null 2>/dev/null || true
endscript
Again this is a CentOS-specific example.

Wipe clean logfile on daily basis with logrotate

I want to clean a Docker container logs every day (no need to store/archive the data). I created a file called docker in /etc/logrotate.d and put the following inside:
/var/lib/docker/containers/*/*.log {
rotate 0 #do not keep archives
daily
missingok
copytruncate #continue working in the same log file
}
Well, it doesn't work. So obviously something in my logrotate configuration isn't right. AFAIK, I don't need to set up crontab for this. Is my configuration wrong? Is there anything that I am missing?
Is there a way to run and test logrotate without having to wait a day?
rotate 0 is redundant here: 0 is the default for rotate, which means no rotation, and old logs are removed.
You can test it using,
$ logrotate -f /etc/logrotate.d/docker
It might be better to test with rotate 1 first and then with rotate 0 to check that logrotate is functioning correctly.
If logrotate is installed correctly, there will typically be a script at /etc/cron.daily/logrotate. It runs logrotate against the configuration file /etc/logrotate.conf, which includes everything under /etc/logrotate.d.
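To try a stanza like the one in the question without touching the real Docker logs or the system state file, something along these lines may help. It is only a sketch: the helper name try_copytruncate and the scratch file names are made up, and it assumes logrotate is on the PATH. -d prints what would happen without doing it, -f forces a rotation, and -s points logrotate at a private state file.

```shell
#!/bin/sh
# Hypothetical helper: run a copytruncate stanza against a scratch log file.
try_copytruncate() {
    work=$(mktemp -d) || return 1
    printf 'line 1\nline 2\n' > "$work/app.log"
    cat > "$work/test.conf" <<EOF
$work/app.log {
    rotate 0
    daily
    missingok
    copytruncate
}
EOF
    # logrotate refuses config files that are group- or world-writable
    chmod 644 "$work/test.conf"
    # Dry run: only reports what would be done
    logrotate -d -s "$work/state" "$work/test.conf"
    # Real run: force a rotation now, keeping state in the scratch directory
    logrotate -f -s "$work/state" "$work/test.conf" || return 1
    # Print the size of the log in bytes after the forced run
    wc -c < "$work/app.log"
}
```

If the stanza behaves as intended, the last line printed should show that app.log has been truncated to zero bytes.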
HTH

Zip the log file after cron finishes to save disk space - Ubuntu

I have this command to run a cron and create a log file out of it
cd /root/amazon-crawler/ && python batchscript.py >> `date +%Y%m%d%H%M%S`cronlog.log 2>&
I am running this cron job twice a day, and each log file is 400 MB to 700 MB in size.
As you can see, a new file is created every time because I don't want to lose older log files, though I can manually delete files older than a week.
Is there any way to zip the log file after the cron job has finished?
Better still, use logrotate. It can automatically:
rename logfiles
compress them
discard old logfiles
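If you would rather keep one timestamped file per run than switch to logrotate, a small wrapper can compress the log once the job exits. This is a sketch of a different approach from the logrotate suggestion above, not a replacement for it; the function name run_and_compress is hypothetical, and redirecting both stdout and stderr into the log is an assumption about what the original command intended.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, log its output, gzip the log afterwards.
# Usage: run_and_compress LOG_DIR COMMAND [ARGS...]
run_and_compress() {
    log_dir=$1
    shift
    log_file="$log_dir/$(date +%Y%m%d%H%M%S)cronlog.log"
    # Capture stdout and stderr of the job in the timestamped log
    "$@" >> "$log_file" 2>&1
    status=$?
    # Compress only after the job has finished writing
    gzip -f "$log_file"
    # Hand the job's own exit status back to the caller
    return "$status"
}

# Example crontab command (assumes this file is saved and sourced first):
# cd /root/amazon-crawler/ && run_and_compress . python batchscript.py
```

Plain-text logs of this kind usually compress well, and zcat or zless can still read them afterwards.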

Keep files updated from remote server

I have a server at hostname.com/files. Whenever a file has been uploaded I want to download it.
I was thinking of creating a script that constantly checked the files directory. It would check the timestamp of the files on the server and download them based on that.
Is it possible to check the files timestamp using a bash script? Are there better ways of doing this?
I could just download all the files in the server every 1 hour. Would it therefore be better to use a cron job?
If you have a regular interval at which you'd like to update your files, yes, a cron job is probably your best bet. Just write a script that does the checking and run that at an hourly interval.
As @Barmar commented above, rsync could be another option. Put something like this in the crontab and you should be set:
# min hour day month day-of-week user command
17 * * * * user rsync -av http://hostname.com/ >> rsync.log
This would grab files from the server in that location and append the details to rsync.log at the 17th minute of every hour. Right now, though, I can't get rsync to fetch files from a web server; rsync does not speak HTTP, so it needs SSH or an rsync daemon on the remote side.
Another option using wget is:
wget -Nrb -np -o wget.log http://hostname.com/
where -N re-downloads only files newer than the timestamp on the local version, -b sends the process to the background, -r recurses into directories, and -o specifies a log file. This works from an arbitrary web server. -np makes sure it doesn't ascend into a parent directory, which would otherwise effectively spider the entire server's content.
More details, as usual, will be in the man pages of rsync or wget.

logrotate compress files after the postrotate script

I have an application generating a really big log file every day (~800 MB a day), so I need to compress the logs. Since compression takes time, I want logrotate to compress the file after reloading (sending a HUP signal to) the application.
/var/log/myapp.log {
rotate 7
size 500M
compress
weekly
postrotate
/bin/kill -HUP `cat /var/run/myapp.pid 2>/dev/null` 2>/dev/null || true
endscript
}
Is it already the case that the compression takes place after the postrotate (which would be counter-intuitive)?
If not, can anyone tell me whether it's possible to do that without an extra command script (an option or some trick)?
Thanks
Thomas
Adding this info here for anyone else who comes across this thread while searching for a way to run a script on a file once compression has completed.
As suggested above using postrotate/endscript is no good for that.
Instead you can use lastaction/endscript, which does the job perfectly.
The postrotate script always runs before compression even when sharedscripts is in effect. Hasturkun's additional response to the first answer is therefore incorrect. When sharedscripts is in effect the only compression performed before the postrotate is for old uncompressed logs left lying around because of a delaycompress. For the current logs, compression is always performed after running the postrotate script.
The postrotate script does run before compression occurs: from the man page for logrotate
The next section of the config files defined how to handle the log file
/var/log/messages. The log will go through five weekly rotations before
being removed. After the log file has been rotated (but before the old
version of the log has been compressed), the command /sbin/killall -HUP
syslogd will be executed.
In any case, you can use the delaycompress option to defer compression to the next rotation.
@Hasturkun - One cannot add a comment unless their reputation is first above 50.
To make sure of what logrotate will do, either test your configuration with -d (debug, which tests but does not do anything) and -f (force it to run), or execute logrotate with the -v verbose flag.
With a configuration that uses sharedscripts for postrotate:
$ logrotate -d -f <logrotate.conf file>
Shows the following steps:
rotating pattern: /tmp/log/messages /tmp/log/maillog /tmp/log/cron
...
renaming /tmp/log/messages to /tmp/log/messages.1
renaming /tmp/log/maillog to /tmp/log/maillog.1
renaming /tmp/log/cron to /tmp/log/cron.1
running postrotate script
<kill-hup-script executed here>
compressing log with: /bin/gzip
compressing log with: /bin/gzip
compressing log with: /bin/gzip
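The ordering can also be checked on a scratch file. The sketch below is illustrative only: the helper name show_hook_order and the seen_by_* file names are invented for the demo, and it assumes logrotate and gzip are installed. Each hook records a directory listing, so you can see that postrotate still finds the uncompressed .1 file while lastaction finds the .gz.

```shell
#!/bin/sh
# Hypothetical demo: record what postrotate and lastaction each see.
show_hook_order() {
    work=$(mktemp -d) || return 1
    printf 'some log data\n' > "$work/myapp.log"
    cat > "$work/test.conf" <<EOF
$work/myapp.log {
    rotate 7
    compress
    postrotate
        ls "$work" > "$work/seen_by_postrotate"
    endscript
    lastaction
        ls "$work" > "$work/seen_by_lastaction"
    endscript
}
EOF
    # logrotate refuses config files that are group- or world-writable
    chmod 644 "$work/test.conf"
    # Force a rotation with a private state file
    logrotate -f -s "$work/state" "$work/test.conf" || return 1
    # Print the scratch directory so the listings can be inspected
    echo "$work"
}

# Example:
# dir=$(show_hook_order)
# cat "$dir/seen_by_postrotate" "$dir/seen_by_lastaction"
```

In the real stanza, the work that must wait for compression would go in the lastaction block, while the HUP signal stays in postrotate.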
