RollingFileAppender - rolling yearly? - log4net

Is it possible to set the RollingFileAppender to roll yearly rather than monthly? There won't be enough entries to require one file per month, so I'd like to set it up on a yearly basis, but when I set the datePattern = "yyyy" log4net said it was unable to parse (Invalid Roll Point).

No.
RollingFileAppender.RollPoint.TopOfMonth is the biggest value.
See http://logging.apache.org/log4net/release/sdk/log4net.Appender.RollingFileAppender.RollPoint.html for reference.
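If monthly rolling is acceptable as the closest supported option, a date pattern of ".yyyyMM" resolves to the TopOfMonth roll point. A minimal configuration sketch (file path and layout are placeholders, not taken from the question):
  <appender name="RollingLogFile" type="log4net.Appender.RollingFileAppender">
    <file value="logs/app.log" />
    <appendToFile value="true" />
    <rollingStyle value="Date" />
    <datePattern value=".yyyyMM" />
    <staticLogFileName value="true" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%date [%thread] %-5level %logger - %message%newline" />
    </layout>
  </appender>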

Related

OpenSearch - best practice for indexing

I have ~1 TB of old Apache log data I would like to index in OpenSearch. Logs are per day and structured like: s3://bucket/logdata/year/year_month_day.json.gz
I plan to use Logstash for the ingest and wonder what the best indexing strategy is to get good performance.
I would like one index per day, but how do I extract the date from the logfile name above to get it right in the Logstash conf file?
index => "%{+YYYY.MM.dd}" will solve it for future logfiles, but how do I solve it for the old ones?
You can do it like this using the dissect filter, which can parse the date components from the S3 object key and reconstruct the date into a new field called log_date:
dissect {
  mapping => {
    "[@metadata][s3][key]" => "%{ignore}/logdata/%{+ignore}/%{year}_%{+month}_%{day}.json.gz"
  }
  add_field => {
    "log_date" => "%{year}-%{month}-%{day}"
  }
  remove_field => ["ignore"]
}
Then in your output section you can reference that new field in order to build your index name:
index => "your-index-%{log_date}"
PS: another way is to parse the year_month_day part as one token and replace the _ characters with - using mutate/gsub
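That alternative would look roughly like this (a sketch only; field names such as log_date and ignore are illustrative):
dissect {
  mapping => {
    "[@metadata][s3][key]" => "%{ignore}/logdata/%{+ignore}/%{log_date}.json.gz"
  }
  remove_field => ["ignore"]
}
mutate {
  # turn year_month_day into year-month-day
  gsub => ["log_date", "_", "-"]
}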
In my experience, daily indices can quickly get out of control: they vary greatly in size, and with a decent retention period the cluster can end up oversharded. I would recommend setting up ILM rollover with a policy based on both index age (7 or 30 days, depending on logging volume) and primary shard size (a common threshold is 50 GB). You can also add a delete phase to the same policy, based on your retention period.
This way you'll get optimal indexing and search performance, as well as uniform load distribution and resource usage.
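For illustration, a rollover policy along these lines captures that recommendation (shown in Elasticsearch ILM syntax; OpenSearch's ISM equivalent expresses the same idea with different field names, and the 180d retention below is only a placeholder):
PUT _ilm/policy/apache-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "180d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}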

Dealing with a daily time window across timezones in Node.js

Currently, I'm working on a project that requires a window of time to be selected that is used as a valid window to trigger an event within. This window is selected by the user as a start time (24 hour time), end time (24 hour time), and a timezone. My goal is to then be able to convert these times into UTC based on the offset from the provided timezone and save into MySQL.
The main problem is I have set up the entire flow to deal with time-only data types from the mobile app all the way back to the MySQL database. I have been trying to figure out a solution that won't require changing all those data types to include date and time which would require changes in many parts of the project.
Can I make this calculation without dealing with the date? I don't believe I can as timezone offsets range from -12:00 to +14:00 which would push some windows to the next or previous days when turned into UTC.
Is the correct approach to add in the date component and then continue to update it as time progresses? I also want to ensure daylight savings doesn't create errors.
Ultimately I would like the best approach to take, so if I have to change a lot now I'd rather do that than deal with a headache later. Any thoughts would be greatly appreciated!
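To illustrate the point about offsets pushing a window across day boundaries: the project is Node.js, but the arithmetic is the same in any language, so here is a small java.time sketch with an arbitrary zone and time:
import java.time.*;

public class OffsetDemo {
    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Pacific/Kiritimati"); // fixed UTC+14 offset
        LocalTime windowStart = LocalTime.of(1, 0);    // 01:00 local, time-only

        // A concrete date is needed before the offset (and any DST rule) can be applied.
        ZonedDateTime local = LocalDate.of(2023, 6, 1).atTime(windowStart).atZone(zone);
        System.out.println(local.withZoneSameInstant(ZoneOffset.UTC));
        // Prints 2023-05-31T11:00Z -- the window start falls on the previous UTC day.
    }
}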

Liferay: huge DLFileRank table

I have a Liferay 6.2 server that has been running for years and is starting to take a lot of database space, despite limited actual content.
Table            Size     Number of rows
----------------------------------------
DLFileRank       5 GB     16 million
DLFileEntry      90 MB    60,000
JournalArticle   2 GB     100,000
The size of the DLFileRank table seems abnormally big to me (if it is totally normal, please let me know).
While the file ranking feature of Liferay is nice to have, we would not really mind resetting it if it halves the size of the database.
Question: Would a DELETE FROM DLFileRank be safe? (stop Liferay, run that SQL command, maybe set dl.file.rank.enabled=false in portal-ext.properties, start Liferay again)
Is there any better way to do it?
Bonus if there is a way to keep recent ranking data and throw away only the old data (not a strong requirement).
Wow. According to the documentation here (Ctrl-F rank), I'd not have expected the number of entries to be so high - did you configure those values differently?
Set the interval in minutes on how often CheckFileRankMessageListener will run to check for and remove file ranks in excess of the maximum number of file ranks to maintain per user per file. Defaults:
dl.file.rank.check.interval=15
Set this to true to enable file rank for document library files. Defaults:
dl.file.rank.enabled=true
Set the maximum number of file ranks to maintain per user per file. Defaults:
dl.file.rank.max.size=5
And according to the implementation of CheckFileRankMessageListener, it should be enough to just trigger DLFileRankLocalServiceUtil.checkFileRanks() yourself (e.g. through the scripting console). Why you accumulate that large number of files is beyond me...
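For example, from the script console (Groovy), something along these lines should trigger the same cleanup that the listener performs on its schedule; the package name below is the one used in Liferay 6.2, so double-check it against your version:
// Liferay 6.2 script console (Groovy)
import com.liferay.portlet.documentlibrary.service.DLFileRankLocalServiceUtil;

// Removes file ranks in excess of dl.file.rank.max.size per user per file
DLFileRankLocalServiceUtil.checkFileRanks();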
As you might know, I can never be quoted as saying that direct database manipulation is the way to go - in fact, I refuse to think about the problem that way.

Azure : Resource usage API issue

I tried to pull the Azure resource usage data for billing metrics. I followed the steps mentioned in the blog below to get usage data of resources.
https://msdn.microsoft.com/en-us/library/azure/mt219001.aspx
Even if I set the start and end time parameters in the URL, they don't take effect: it returns the entire output [from the time the resource was created/added].
For example:
https://management.azure.com/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/providers/Microsoft.Commerce/UsageAggregates?api-version=2015-06-01-preview&reportedStartTime=2017-03-03T00%3a00%3a00%2b00%3a00&reportedEndTime=2017-03-04T00%3a00%3a00%2b00%3a00&aggregationGranularity=Hourly&showDetails=true
As per the above URL, it should return the data between 2017-03-03 and 2017-03-04, but it shows data from 2nd March [2017-03-02]. I don't know why it returns the entire output and the time filter is not working.
Note: The end time parameter does take effect, meaning the output stops at the time mentioned in the end time, but the start time is not taken into account.
Does anyone have a suggestion on this?
So there are a few things to consider:
There is usage date/time and then there is reported date/time. The former tells you the date/time when the resources were used, while the latter tells you the date/time when this information was received by the billing sub-system. There will be some delay between when the resources are used and when they are reported. From this link:
Set {dateTimeOffset-value} for reportedStartTime and reportedEndTime to valid dateTime values. Please note that this dateTimeOffset value represents the timestamp at which the resource usage was recorded within the Azure billing system. As Azure is a distributed system, spanning across 19 datacenters around the world, there is bound to be a delay between the resource usage time (when the resource was actually consumed) and the resource usage reported time (when the usage event reached the billing system) and callers need a predictable way to get all usage events for a subscription for a given time period.
The query only lets you search by reported date/time; there is no provision for searching by usage date/time. However, the data returned to you contains the usage date/time, not the reported date/time.
Long story short, because of the delay in propagating the usage information to the billing sub-system, the behavior you're seeing is correct. In my experience, it takes about 24 hours for all the usage information to show up in the billing sub-system.
The way we handle this scenario in our application is we fetch the data for a longer duration and then pick up only the data we're interested in seeing. So for example, if I need to see the data for 1st of March then we query the data for reported date/time from 1st March to say 4th March (i.e. today's date) and then discard any data where usage date is not 1st of March.
If we don't find any data (which is quite possible and is happening in your case as well), we simply tell the users that usage information is not yet available.
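As an illustration of that approach, here is a rough Java sketch; the UsageRecord type and the already-parsed list are hypothetical stand-ins for whatever you deserialize from the UsageAggregates response:
import java.time.LocalDate;
import java.time.OffsetDateTime;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical shape of one usage aggregate parsed from the API response.
class UsageRecord {
    OffsetDateTime usageStartTime; // when the resource was actually consumed
    String meterName;
    double quantity;
}

public class UsageFilter {
    // Query a wider reported-time window, then keep only the records whose
    // *usage* date matches the day we actually care about.
    static List<UsageRecord> forDay(List<UsageRecord> fetched, LocalDate day) {
        return fetched.stream()
                .filter(r -> r.usageStartTime.toLocalDate().equals(day))
                .collect(Collectors.toList());
    }
}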

What is the earliest timestamp value that is supported in ZIP file format?

I am trying to store dates as the last modification timestamp in a ZIP file. It seems that the ZIP format supports only dates after 1980-01-01 as a last modification time (at least via the Java API java.util.zip.ZipEntry).
Is this correct? Is the earliest supported modification timestamp really 1980-01-01 00:00:00? I tried to find some references to verify this but I couldn't find any.
Zip entry timestamps are recorded only to two-second precision. This reflects the accuracy of DOS timestamps in use when PKZIP was created. The number recorded in the Zip will be the timestamp truncated, not rounded to the nearest two seconds.

When you archive and restore a file, it will no longer have a timestamp precisely matching the original. This is above and beyond the similar problem with Java using 1 millisecond precision and Microsoft Windows using 100 nanosecond increments. The PKZIP format derives from MS-DOS days and hence uses only 16 bits for time and 16 bits for date. There is an extended time stamp defined in the revised PKZIP format, but Java does not use it.

Inside zip files, dates and times are stored in local time in 16 bits, not UTC as is conventional, using an ancient MS-DOS format. Bit 0 is the least significant bit. The format is little-endian. There was not room in 16 bits to accurately represent time even to the second, so the seconds field contains the seconds divided by two, giving accuracy only to the even second.

This means the apparent time of files inside a zip will suddenly differ by an hour compared with their uncompressed counterparts every time you have a daylight saving change. It also means that a zip utility will extract a different UTC time from a Zip member date depending on which timezone the calculation was done in. This is ridiculous. The PKZIP format needs a modern UTC-based timestamp to avoid these anomalies.

To make matters worse, standard tools like WinZip or PKZIP will always round the time up to the next even second when they restore, thereby possibly making the file one to two seconds younger. The JDK (i.e. javaToDosTime in ZipEntry) rounds the time down, thereby making the file one to two seconds older.

The format does not support dates prior to 1980-01-01 00:00 UTC. Avoid file dates of 1980-01-01 or earlier (local or UTC time).

Wait! It gets even worse. Phil Katz, when he documented the Zip format, did not bother to specify whether the local time used in the archive should be daylight or standard time.

And to cap it off… Info-ZIP, JSE and TrueZIP apply the DST schedule (days where DST began and ended in any given year) for any date when converting times between system time and DOS date/time. This is as it should be. Vista's Explorer, 7-Zip and WinZip apply only the DST savings, but do not apply the schedule. So they use the current DST savings for any date when converting times between system time and DOS date/time. This is just sloppy.
http://mindprod.com/jgloss/zip.html
tar files are so much better.
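To see the 1980 floor from Java, a small sketch like this (file name and date are arbitrary) writes a pre-1980 timestamp and reads it back; older JDKs clamp the value to the DOS epoch of 1980-01-01, while newer ones may preserve it via the extended timestamp extra field:
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEpochDemo {
    public static void main(String[] args) throws IOException {
        long preDos = new GregorianCalendar(1975, Calendar.JUNE, 1).getTimeInMillis();

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buf)) {
            ZipEntry entry = new ZipEntry("old-file.txt");
            entry.setTime(preDos); // last-modified time before 1980
            zos.putNextEntry(entry);
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }

        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
            ZipEntry entry = zis.getNextEntry();
            // On older JDKs this prints a date in 1980, not 1975.
            System.out.println(new Date(entry.getTime()));
        }
    }
}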
