Is there an appender/configuration for log4j or Logback that allows you to write to a GZIP file?

I'm having an issue with logging that uses too much disk I/O and too much space when a large number of users are on a live system, which has problems that only occur in live.
Is there a log4j or (preferably) Logback appender/configuration that allows writing directly to a GZIP-compressed file?

This feature already exists in Logback. Take a look at the appenders section of the documentation, specifically at the time-based rolling policy.
Just like FixedWindowRollingPolicy, TimeBasedRollingPolicy supports automatic file compression. This feature is enabled if the value of the fileNamePattern option ends with .gz or .zip.
Also take a look at the size-and-time-based rolling policy.
You can set up rollover to occur after a log file hits a certain size limit.
I don't believe writing directly to a GZIP-compressed file for every log statement would be feasible, since it would add considerable performance overhead. Using a combination of existing features sounds reasonable to me.

The space issue is already solved by Logback: it compresses your log files during rollover. The I/O issue is a different matter, and I am afraid Logback does not offer a solution for it.


Will an I/O redirection operation lock the file?

I have a growing nginx log file, already about 20 GB, and I want to rotate it.
1. I mv the old log file to a new log file.
2. I run > old_log_file.log to truncate the old log file, which takes about 2~3 seconds.
Is there a lock (a write lock?) on the old log file while I am truncating it (about 2~3 seconds)?
During that 2~3 second period, does nginx return 502 because it is waiting to append to the old log file until the lock is released?
Thank you for explaining.
On Linux, there are (almost) no mandatory file locks (more precisely, there used to be a mandatory locking feature in the kernel, but it is deprecated and you really should avoid using it). File locking happens with flock(2) or lockf(3), is advisory, and should be explicit (e.g. with the flock(1) command, or some program calling flock or lockf).
So every lock related to files is in practice a convention between all the software using that file (and mv(1) and the redirection done by your shell don't use file locking).
Remember that a file on Linux is mostly an i-node (see inode(7)), which could have zero, one or several file paths (see path_resolution(7) and be aware of link(2), rename(2), unlink(2)) and is used through some file descriptor. Read ALP (and perhaps Operating Systems: Three Easy Pieces) for more.
No file locking happens in the scenario of your question (and the i-nodes and file descriptors involved are independent).
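The independence of paths, i-nodes and file descriptors can be checked with a small experiment. The following is only a sketch in Python, assuming a POSIX system; the file names are hypothetical and the `writer` object merely stands in for a long-running process such as nginx that keeps its log file open.

```python
import os
import tempfile


def rotate_demo():
    """Mimic the mv + '>' sequence from the question while a writer keeps the file open."""
    workdir = tempfile.mkdtemp()
    old_path = os.path.join(workdir, "access.log")    # hypothetical name
    new_path = os.path.join(workdir, "access.log.1")  # hypothetical name

    writer = open(old_path, "a")  # stands in for the server's open log descriptor
    writer.write("before rotate\n")
    writer.flush()

    os.rename(old_path, new_path)  # step 1 (mv): same i-node, new path
    open(old_path, "w").close()    # step 2 (> old_log_file.log): a new, empty file

    writer.write("after rotate\n")  # the writer is not blocked by either step
    writer.flush()
    same_inode = os.fstat(writer.fileno()).st_ino == os.stat(new_path).st_ino
    writer.close()

    with open(new_path) as f:
        renamed_content = f.read()
    return {
        "renamed_content": renamed_content,
        "old_size": os.path.getsize(old_path),
        "same_inode": same_inode,
    }


if __name__ == "__main__":
    print(rotate_demo())
```

In this sketch the already-open descriptor keeps following the renamed file, so the redirection only creates a fresh empty file under the old name. That is why the writing program has to re-open its log file before anything lands under the old path again.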
Consider using logrotate(8).
Some software provides a way to reload its configuration and re-open its log files. You should read the documentation of your nginx.
Whether the file is locked depends on the application. The application that generates the log file should have an option to clear it. As an example, a file that is open in an editor such as vim can still be modified externally.

Detecting if the code read the specified input file

I am writing some automated tests for testing code and providing feedback to the programmer.
One of the requirements is to detect whether the code has successfully read the specified input file. If not, we need to provide feedback to the user accordingly. One way to detect this is the atime timestamp, but since our server drive is mounted with the relatime option, we do not get an atime update for every file read. Changing this option to record every atime is not feasible, as it slows down our I/O operations significantly.
Is there any other alternative that we can use to detect if the given code indeed reads the specified input file?
Here's a wild idea: intercept the read call at some point. One possible approach goes more or less like this:
The program does all its reading through an abstraction. For example, MyFileUtils (custom) instead of File (stdlib).
During normal operation, MyFileUtils simply delegates the work to File (or whatever system built-in libraries/calls you use).
But under test, MyFileUtils is replaced with a special test version which, along with the delegation, also reports usage to the framework.
Note that in some environments/languages it might be possible to inject code into File directly and the abstraction will not be needed.
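The approach above can be sketched in Python as follows. Only the name MyFileUtils comes from the answer; `RecordingFileUtils`, the `read` method and `count_lines` are hypothetical names invented for this illustration, and the built-in `open` plays the role of the stdlib File.

```python
import os


class MyFileUtils:
    """Normal operation: simply delegates to the built-in open()."""

    def read(self, path):
        with open(path, "r") as f:
            return f.read()


class RecordingFileUtils(MyFileUtils):
    """Test version: delegates, and also records every path that was read."""

    def __init__(self):
        self.reads = []

    def read(self, path):
        data = super().read(path)  # raises on failure, so nothing is recorded
        self.reads.append(os.path.abspath(path))
        return data


def count_lines(path, files=None):
    """Hypothetical code under test; all reading goes through the abstraction."""
    if files is None:
        files = MyFileUtils()
    return len(files.read(path).splitlines())
```

A test framework would pass a `RecordingFileUtils` instance to the code under test and afterwards check whether the expected input path appears in `reads`.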
I agree with Sergio: touching a file doesn't mean that it was read successfully. If you want to be really "sure", those programs have to "send" some sort of indication back. And of course, there are many ways to do that.
A pragmatic way could be this: assuming that the programs under test create log files, your "test monitor" could check that the log files contain fixed entries such as "reading xyz PASSED" or something similar.
If your "code under test" doesn't create log files, consider changing that.

How do I limit log4j files based on size (only)?

I would like to configure log4j to write only files up to a maximum specified size, e.g. 5MB. When the log file hits 5MB I want it to start writing to a new file. I'd like to put some kind of meaningful timestamp into the logfile name to distinguish one file from the next.
I do not need it to rename or manipulate the old files in any way when a new one is written (compression would be a boon, but is not a must).
I absolutely do not want it to start deleting old files after a certain period of time or number of rolled files.
Timestamped chunks of N MB logfiles seem like the absolute basic minimum simple strategy I can imagine. It's so basic that I almost refuse to believe that it's not provided out of the box, yet I have been incapable of figuring out how to do this!
In particular, I have tried all incarnations of DailyRollingFileAppender, RollingFileAppender or the 'Extras' RollingFileAppender. All of these delete old files when the backupcount is exceeded. I don't want this, I want all my log files!
TimeBasedRollingPolicy from Extras does not fit because it doesn't have a max file size option and you cannot associate a secondary SizeBasedTriggeringPolicy (as it already implements TriggeringPolicy).
Is there an out of the box solution - preferably one that's in maven central?
I gave up trying to get this to work out of the box. It turns out that is, as the young folk apparently say, the bomb.
I tried getting in touch some time back to get the author to stick this into maven central, but no luck so far.

How can one monitor what part of big file changed

Is there a solution for Linux kernel 3.0 (or later) that allows one to get notifications, similar to inotify, describing the particular segment of a file that was changed?
There was the fschange patch for kernels up to 2.6.21. Is there any up-to-date solution available? Is the recent fanotify able to provide this functionality?
Not that I know of, but there is a way to sort of hack the functionality: use the file change notification as a trigger to read the on-disk format of the file system and examine its internal block allocation tables to learn what has changed.
It's tricky to do, suffers from race conditions and is probably a bad idea, but if you must, and coding an fschange on top of 3.0 is not an option for you, it might be the way to go.
IMO, forget about inotify unless "the pretty" is important. Otherwise, you can set up a cron job with a script that does a diff or uses find with the -mtime option.
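One way such a periodic "diff" script could narrow a change down to a segment is to compare per-block checksums between two runs. This is only a sketch in Python under assumptions not found in the answers: the 1 MiB block size and both function names are made up for illustration. It also has to re-read the whole file on every run, so it is not a substitute for a kernel notification.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # assumed segment granularity (1 MiB)


def block_digests(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def changed_blocks(old, new):
    """Return indices of blocks that differ, including blocks added or removed at the end."""
    longest = max(len(old), len(new))
    return [
        i
        for i in range(longest)
        if i >= len(old) or i >= len(new) or old[i] != new[i]
    ]
```

The script would store the digest list from the previous run, for example in a side file, and report `changed_blocks(previous, current)` each time it is invoked.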

Should AspBufferLimit ever need to be increased from the default of 4 MB?

A fellow developer recently requested that the AspBufferLimit in IIS 6 be increased from the default value of 4 MB to around 200 MB for streaming larger ZIP files.
Having left the Classic ASP world some time ago, I was scratching my head as to why you'd want to buffer a BinaryWrite and simply suggested setting Response.Buffer = false. But is there any case where you'd really need to make it 50x the default size?
Obviously, memory consumption would be the biggest worry. Are there other concerns with changing this default setting?
Increasing the buffer like that is a supremely bad idea. You would allow every visitor to your site to use up to that amount of RAM. If your BinaryWrite/Response.Buffer=false solution doesn't appease him, you could also suggest that he call Response.Flush() now and then. Either would be preferable to increasing the buffer size.
In fact, unless you have a very good reason, you shouldn't even pass this through the ASP processor. Write it to a special place on disk set aside for such things and redirect there instead.
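The idea behind flushing "now and then" is that only one chunk is held in memory at a time, however large the file is. The sketch below illustrates that pattern in Python rather than Classic ASP; the 64 KiB chunk size and the function name are assumptions, and `sink.flush()` merely plays the role of Response.Flush().

```python
CHUNK_SIZE = 64 * 1024  # assumed chunk size (64 KiB)


def stream_in_chunks(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time and return the number of bytes sent."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        sink.flush()  # push the chunk out instead of letting it accumulate
        total += len(chunk)
    return total
```

Memory use in this sketch is bounded by the chunk size, not by the size of the file being sent.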
One of the downsides of turning off the buffer (you could use Flush, but I really don't see why you'd do that in this scenario) is that the client doesn't learn the content length at the start of the download. Hence the browser's dialog at the other end is less meaningful: it can't tell how much progress has been made.
A better (IMO) alternative is to write the desired content to a temporary file (perhaps using a GUID for the file name) and then send a redirect to the client pointing at this temporary file.
There are a number of reasons why this approach is better:
The client gets good progress info in the save dialog or in the application receiving the data.
Some applications can make good use of byte-range fetches, which only work well when the server is delivering "static" content.
The temporary file can be re-used to satisfy requests from other clients.
There are a number of downsides, though:
If it takes some time to create the file content, writing to a temporary file first adds latency before any data is received and so increases the download time.
If strong security is needed on the content, having a static file lying around may be a concern, although the use of a random GUID filename mitigates that somewhat.
There is a need for some housekeeping of old temporary files.
