CopyTruncate log rotation mechanism is dropping logs - linux

I have implemented log rotation on Linux using the copytruncate strategy. Below is the config:
/data/app/info.log {
missingok
copytruncate
maxsize 50M
daily
rotate 30
create 644 app app
delaycompress
compress
}
With the above config, whenever the log rotation task is triggered while the application is simultaneously writing logs, some log lines get dropped. Can someone please explain what I am doing wrong, or suggest another log rotation strategy with no data loss?

I know this question is a few months old, but simply for the benefit of others: you are not doing anything wrong. From the manpage:
copytruncate:
Truncate the original log file in place after creating a copy, instead of moving the old log file and optionally creating a new one. It can be used when some program cannot be told to close its logfile and thus might continue writing (appending) to the previous log file forever. Note that there is a very small time slice between copying the file and truncating it, so some logging data might be lost. When this option is used, the create option will have no effect, as the old log file stays in place.
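In other words, the small loss window is inherent to copytruncate. The usual loss-free approach is to drop copytruncate and have the application reopen its log file after rotation. This is only a sketch, and it assumes your app can be told to reopen its log; the HUP signal and the process name below are placeholders for whatever reload mechanism your application actually supports:
/data/app/info.log {
daily
maxsize 50M
rotate 30
missingok
compress
delaycompress
create 644 app app
sharedscripts
postrotate
# Hypothetical: signal the app to close and reopen its log file.
/usr/bin/killall -HUP app || true
endscript
}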

Related

Best Way to Save Sensor Data, Split Every x Megabytes in Python

I'm saving sensor data at 64 samples per second into a CSV file. The file is about 150 MB at the end of 24 hours. It takes a bit longer than I'd like to process, and I need to do some processing in real time.
value = str(milivolts)
logFile.write(str(datet) + ',' + value + "\n")
So I end up with single lines containing the date and millivolts, up to 150 MB. At the end of 24 hours it makes a new file and starts saving to it.
I'd like to know if there is a better way to do this. I have searched but can't find any good information on a compression to use while saving sensor data. Is there a way to compress while streaming / saving? What format is best for this?
While saving the sensor data, is there an easy way to split it into x megabyte files without data gaps?
Thanks for any input.
I'd like to know if there is a better way to do this.
One of the simplest ways is to use a logging framework; it will allow you to configure which compressor to use (if any), the approximate size of a file, and when to rotate logs. You could start with this question. Try experimenting with several different compressors to see if the speed/size trade-off is OK for your app.
While saving the sensor data, is there an easy way to split it into x megabyte files without data gaps?
A logging framework would do this for you based on the configuration. You could combine several different options: have fixed-size logs and rotate at least once a day, for example.
Generally, the split is accurate up to the size of a logged line, so if the data is written as lines of reasonable size, this makes life super easy: one line ends up in one file, and the next is written into the new file.
Files also rotate, so you can have order of the data encoded in the file names:
raw_data_<date>.gz
raw_data_<date>.gz.1
raw_data_<date>.gz.2
In pseudocode it will look like this:
# Parse where to save data, should we compress data,
# what's the log pattern, how to rotate logs etc
loadLogConfig(...)
# any compression, rotation, flushing etc happens here
# but we don't care, and just write to file
logger.trace(data)
# on shutdown, save any temporary buffer to the files
logger.flush()
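As a concrete (if minimal) sketch of this idea using Python's standard logging module — the file name, size, backup count, and logger name are arbitrary example values, and compression would still need a custom rotator or a separate step:
import logging
from logging.handlers import RotatingFileHandler

# Rotate raw_data.log when it reaches roughly 50 MB, keeping 10 old files
# (raw_data.log.1 ... raw_data.log.10). All values here are examples.
handler = RotatingFileHandler("raw_data.log",
                              maxBytes=50 * 1024 * 1024,
                              backupCount=10)
handler.setFormatter(logging.Formatter("%(message)s"))

logger = logging.getLogger("sensor")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

def record_sample(timestamp, millivolts):
    # One CSV line per sample; a line is never split across files on rotation.
    logger.info("%s,%s", timestamp, millivolts)

record_sample("2021-01-01T00:00:00", 512)  # example usage
logging.shutdown()                         # flush and close handlers on shutdown
If you would rather rotate strictly once per day instead of by size, logging.handlers.TimedRotatingFileHandler works the same way.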

Reduce Service Fabric backup size

I'm trying to use Service Fabric backups with Actors:
var backupDescription = new BackupDescription(BackupOption.Full, BackupCallbackAsync);
await BackupAsync(backupDescription, TimeSpan.FromHours(1), cancellationToken);
But I've noticed that one backup may contain several files like:
edb0000036A.log 5120 KB
edb0000036B.log 5120 KB
edb00000366.log 5120 KB
...
I haven't found any info about these files, but it seems that they are just logs and that I may not need to include them. Am I right, or must these files be included in the backup?
These files are quite heavy, so I'm trying to reduce the size of the backups.
UPDATE 1:
I have tried to use incremental backup, but it seems that Actors do not support incremental backup, as I have read on MSDN. Moreover, when I tested it I got the exception "Invalid backup option. Parameter name: option".
Instead of doing full backups every hour, you can also use incremental backups, which will result in a smaller size (for example, a full backup every day and incrementals every hour).
The log files are transaction logs; they are not optional for restore. More info here.

How to prevent eventLog file of Spark stream jobs eating up space?

We have multiple run-forever streaming jobs generating huge eventLogs. These in-progress logs won't be removed until they reach the max age config (spark.history.fs.cleaner.maxAge).
Based on the Spark source code, "Only completed applications older than the specified max age will be deleted." https://github.com/apache/spark/blob/a45647746d1efb90cb8bc142c2ef110a0db9bc9f/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
So in-progress eventLogs will never be removed before completion, and they are eating up space. Does anyone have an idea how to prevent this?
We have options such as a script that periodically removes old files, but that would be our last resort; we cannot modify the source code, only the configuration.

Storing run time logs in a folder

I am running a shell script in a Linux environment that creates some logs (dynamic log files) as text files.
I want to store all the created log files in a single folder after a particular time.
So how can I do that? Can anyone suggest some commands?
Thanks in advance.
In the script you can define that directory as a variable and use it across the script:
#!/bin/bash
LOG_DIR=/tmp/logs
mkdir -p "$LOG_DIR"              ## make sure the target folder exists
LOG_FILE=$LOG_DIR/log_file.$$    ## $$ creates a different log file for each run
## You could also build the name from a timestamp using the date command.
<Your commands> >> "$LOG_FILE"
It really depends on your situations:
[Suggested if your log files are small in size]
You may want to back up your logs by just adding a cron job that zips/tars them to another folder as a snapshot. Since the log files are small, even zipping/tarring everything would take many, many years to fill up your hard drive.
[Suggested if your log files are large]
In your script that generates logs, you may want to rotate through a few indexed files, say log.0 to log.6, one for each weekday from Sunday to Saturday. You can then have another script back up yesterday's log (so that there are no race conditions between the log producer and the log consumer, i.e. the log mover/copier). You can decide how many days of backups to keep and when older ones should be discarded.
Yesterday's log mover/copier can easily be done by a cron job, as sketched below.
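A rough sketch of such a cron job, assuming the producer writes /var/log/myapp/log.0 ... log.6 (0 = Sunday) and backups go under /backup/logs; all paths and names here are placeholders:
# /etc/cron.d/backup-yesterdays-log (hypothetical file; adjust paths to taste)
# At 00:05 every day, move yesterday's indexed log into the backup folder and
# compress it; date +%w gives the day-of-week index (0-6), % escaped for cron.
5 0 * * * root d=$(date -d yesterday +\%w); f=/backup/logs/log.$(date -d yesterday +\%F); mv /var/log/myapp/log.$d "$f" && gzip "$f"
Discarding old backups is then one more line along the lines of find /backup/logs -mtime +30 -delete.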

Centos/Linux setting logrotate to maximum file size for all logs

We use logrotate and it runs daily... now we have had some situations where logs have grown significantly (read: gigabytes) and killed our server. So now we would like to set a maximum file size for the logs...
can I just add this to the logrotate.conf?
size 50M
and would it then apply to all log files? Or do I need to set this on a per log basis?
Or any other advice?
(PS: I understand that you would want to be notified if a log grows as described, and that what we want to do is not ideal - but it is better than not being able to log on anymore because there is no space available.)
thanks, Sean
As mentioned by Zeeshan, the logrotate options size, minsize, maxsize are triggers for rotation.
To better explain it. You can run logrotate as often as you like, but unless a threshold is reached such as the filesize being reached or the appropriate time passed, the logs will not be rotated.
The size options do not ensure that your rotated logs are also of the specified size. To get them close to the specified size you need to call logrotate sufficiently often. This is critical.
For log files that build up very quickly (e.g. in the hundreds of MB a day), unless you want them to be very large you will need to ensure logrotate is called often.
Therefore, to stop your disk filling up with multi-gigabyte log files, you need to ensure logrotate is called often enough; otherwise the log rotation will not work as well as you want.
On Ubuntu, you can easily switch to hourly rotation by moving the script /etc/cron.daily/logrotate to /etc/cron.hourly/logrotate
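For example (assuming the default cron directories):
sudo mv /etc/cron.daily/logrotate /etc/cron.hourly/logrotate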
Or add the following to your /etc/crontab file to run it every 5 minutes:
*/5 * * * * /etc/cron.daily/logrotate
The size option ignores the daily, weekly, monthly time options. But minsize & maxsize take it into account.
The man page is a little confusing there. Here's my explanation.
minsize rotates only when the file has reached an appropriate size and the set time period has passed, e.g. minsize 50MB + daily:
If the file reaches 50MB before the daily rollover, it will keep growing until the next day.
maxsize will rotate when the log reaches a set size or the appropriate time has passed,
e.g. maxsize 50MB + daily:
If the file is 50MB and we're not at the next day yet, the log will be rotated. If the file is only 20MB and we roll over to the next day, the file will also be rotated.
size will rotate when the log > size, regardless of whether hourly/daily/weekly/monthly is specified. So if you have size 100M, it means the log will be rotated when the log file is > 100M, provided logrotate is run while that condition is true. Once it's rotated, the main log is back to 0 bytes, and a subsequent run will do nothing.
So in the OP's case, specifically 50MB max, I'd use something like the following:
/var/log/logpath/*.log {
maxsize 50M
hourly
missingok
rotate 8
compress
notifempty
nocreate
}
This means he'd keep a maximum of 8 hours of logs: there would be 8 of them at no more than 50MB each. Since he says he's getting multiple gigabytes each day, and assuming they build up at a fairly constant rate and maxsize is used, he'll end up close to the maximum for each file, so likely close to 50MB each. Given the volume at which they build up, he needs to ensure that logrotate is run often enough to meet the target size.
Since I've put hourly there, we'd need logrotate to be run at minimum every hour. But the logs build up to, say, 2 gigabytes per day and we want 50MB; assuming a constant rate, that's about 83MB per hour (2000MB / 24). So you can imagine that if we run logrotate only every hour, despite setting maxsize to 50M we'll end up with 83MB logs. In this instance, running it every 20 minutes or less should be sufficient.
Ensure logrotate is run every 20 minutes.
In this case, after 20 minutes the log file might be < 50MB, but after 40 minutes it will be over 50MB, in which case it will be rotated. If we instead check every 30 minutes: at 30 minutes it's < 50MB (no rotation), and at 1 hour it may have already grown to > 80MB, in which case the file will be rotated but will be much bigger than we were expecting. So make sure you take the check interval into account.
*/20 * * * * /etc/cron.daily/logrotate
The size directive specifies the size of the log file that triggers rotation. For example, size 50M will trigger a rotation once the file is 50MB or greater in size. You can use the suffix M for megabytes, k for kilobytes, and G for gigabytes. If no suffix is used, it is taken to mean bytes. You can check the example at the end. There are three directives available: size, maxsize, and minsize. According to the manpage:
minsize size
Log files are rotated when they grow bigger than size bytes, but not before the additionally specified time interval (daily, weekly, monthly, or yearly). The related size option is similar except that it is mutually exclusive with the time interval options, and it causes log files to be rotated without regard for the last rotation time. When minsize is used, both the size and timestamp of a log file are considered.
size size
Log files are rotated only if they grow bigger than size bytes. If size is followed by k, the size is assumed to be in kilobytes. If the M is used, the size is in megabytes, and if G is used, the size is in gigabytes. So size 100, size 100k, size 100M and size 100G are all valid.
maxsize size
Log files are rotated when they grow bigger than size bytes even before the additionally specified time interval (daily, weekly, monthly, or yearly). The related size option is similar except that it is mutually exclusive with the time interval options, and it causes log files to be rotated without regard for the last rotation time. When maxsize is used, both the size and timestamp of a log file are considered.
Here is an example:
"/var/log/httpd/access.log" /var/log/httpd/error.log {
rotate 5
mail www@my.org
size 100k
sharedscripts
postrotate
/usr/bin/killall -HUP httpd
endscript
}
Here is an explanation for both files /var/log/httpd/access.log and /var/log/httpd/error.log. Each is rotated whenever it grows over 100k in size, and the old log files are mailed (uncompressed) to www@my.org after going through 5 rotations, rather than being removed. The sharedscripts option means that the postrotate script will only be run once (after the old logs have been compressed), not once for each log that is rotated. Note that the double quotes around the first filename at the beginning of this section allow logrotate to rotate logs with spaces in the name. Normal shell quoting rules apply, with the ', ", and \ characters supported.
To simplify the explanation further:
The logrotate size parameter is only applied when logrotate runs.
For example, suppose you set logrotate to run every hour and to rotate when the size reaches 5MB. If the file grows past 5MB before the hour is up, it will in effect grow bigger than 5MB, because logrotate has not yet been called on the file.
It is imperative that logrotate is run on the file frequently enough to check its size. Therefore, when using the size parameter, just let the timing of logrotate be handled by something else (i.e. cron or a script). This means you can omit specifying a time interval in your logrotate config.
For example, if I want to rotate a file at 5MB, how quickly the file reaches that size determines how often logrotate should run. Suppose it takes about 10 minutes on average to reach 5MB. First, the rotation settings at a minimum are:
/var/log/path/the.log {
# number of rotations to keep
rotate 1
size 5M
}
Create the directory /etc/custom-rotate.d and save the above in /etc/custom-rotate.d/customlog.
Permissions: sudo chmod 644 /etc/custom-rotate.d/customlog
Create a config file:
cat << EOF | sudo tee /etc/custom-rotate.conf
# packages drop custom log rotation information into this directory
include /etc/custom-rotate.d
EOF
Permission: sudo chmod 644 /etc/custom-rotate.conf
Then have a cron job run every 5 minutes, say (giving room for possible anomalies), to check the size.
sudo su
crontab -e
Add entry:
*/5 * * * * /usr/sbin/logrotate /etc/custom-rotate.d/customlog
Reload cron:
sudo service cron reload
Therefore every 5 minutes logrotate will run and if the size is greater than 5M it will rotate the logs.
