Log4j logging to a file in Spark - apache-spark

I am trying to add logging to my project. I'm using log4j and I'm putting the configuration in the code itself, without a properties file, as shown below.
import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.EnhancedPatternLayout;
import org.apache.log4j.FileAppender;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class Teste {
    // must be static so it can be used from the static configError() method
    static final Logger log = Logger.getLogger(Teste.class.getName());

    public static void configError() {
        EnhancedPatternLayout layout = new EnhancedPatternLayout();
        String conversionPattern = "%d{ISO8601}{GMT+1} %-5p %m%n";
        layout.setConversionPattern(conversionPattern);
        String fileError = "C:/ProducerError.log";

        // creates console appender
        ConsoleAppender consoleAppender = new ConsoleAppender();
        consoleAppender.setLayout(layout);
        consoleAppender.activateOptions();

        // creates file appender
        FileAppender fileAppender = new FileAppender();
        fileAppender.setFile(fileError);
        fileAppender.setLayout(layout);
        fileAppender.activateOptions();

        // configures the root logger
        Logger rootLogger = Logger.getRootLogger();
        rootLogger.setLevel(Level.ERROR);
        rootLogger.addAppender(consoleAppender);
        rootLogger.addAppender(fileAppender);

        log.error("Error teste");
        rootLogger.removeAllAppenders();
    }
}
I wanted to do the same in a Spark job. I tried the same approach, but it doesn't produce anything. How does Spark logging work? Can't I configure it in code as I did before? I have a Dockerfile with spark-submit, and I'd rather not touch that part.

Provide the config file path when submitting the Spark job: -Dlog4j.configuration=path/to/log4j.properties
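For example, a submit command might look like this (com.example.Main and your-app.jar are placeholders; the file: prefix tells log4j to read the value as a file URL rather than look it up on the classpath):

spark-submit \
  --class com.example.Main \
  --driver-java-options "-Dlog4j.configuration=file:/path/to/log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" \
  your-app.jar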

Related

Add Sentry Log4j2 appender at runtime

I've been browsing previous threads about adding Log4j2 appenders at runtime, but none of them really fits my scenario.
We have a long-running Flink job packaged into a fat JAR that we essentially submit to a running Flink cluster. We want to forward error logs to Sentry. Conveniently, Sentry provides a Log4j2 appender that I want to use, but all attempts to get Log4j2 to work have failed -- I'm going a bit crazy over this (I've spent days on it).
Since Flink (which also uses Log4j2) ships a set of default logging configurations that take precedence over any configuration files bundled in our JAR, I'm essentially left with configuring the appender at runtime, to see whether that will register the appender and forward the LogEvents to it.
As a side note: I attempted to override the Flink-provided configuration file (to add the appender directly to the log4j2.properties file), but Flink fails to load the plugin due to a missing dependency - io.sentry.IHub - which doesn't make sense, since the examples and Sentry docs mention no dependencies beyond the Log4j-related ones that already exist on the classpath.
I've followed the example in the Log4j docs, Programmatically Modifying the Current Configuration after Initialization, but the logs are not getting through to Sentry.
SentryLog4j.scala
package com.REDACTED.thoros.config

import io.sentry.log4j2.SentryAppender
import org.apache.logging.log4j.Level
import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.core.LoggerContext
import org.apache.logging.log4j.core.config.AppenderRef
import org.apache.logging.log4j.core.config.Configuration
import org.apache.logging.log4j.core.config.LoggerConfig

object SentryLog4j2 {
  val SENTRY_LOGGER_NAME = "Sentry"
  val SENTRY_BREADCRUMBS_LEVEL: Level = Level.ALL
  val SENTRY_MINIMUM_EVENT_LEVEL: Level = Level.ERROR
  val SENTRY_DSN = "REDACTED"

  def init(): Unit = {
    // scalafix:off
    val loggerContext: LoggerContext =
      LogManager.getContext(false).asInstanceOf[LoggerContext]
    val configuration: Configuration = loggerContext.getConfiguration

    val sentryAppender: SentryAppender = SentryAppender.createAppender(
      SENTRY_LOGGER_NAME,
      SENTRY_BREADCRUMBS_LEVEL,
      SENTRY_MINIMUM_EVENT_LEVEL,
      SENTRY_DSN,
      false,
      null
    )
    sentryAppender.start()
    configuration.addAppender(sentryAppender)

    // Creating a new dedicated logger for Sentry
    val ref: AppenderRef = AppenderRef.createAppenderRef("Sentry", null, null)
    val refs: Array[AppenderRef] = Array(ref)
    val loggerConfig: LoggerConfig = LoggerConfig.createLogger(
      false,
      Level.ERROR,
      "org.apache.logging.log4j",
      "true",
      refs,
      null,
      configuration,
      null
    )
    loggerConfig.addAppender(sentryAppender, null, null)
    configuration.addLogger("org.apache.logging.log4j", loggerConfig)

    println(configuration.getAppenders)
    loggerContext.updateLoggers()
    // scalafix:on
  }
}
Then I invoke SentryLog4j2.init() in the Main module.
import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.Logger

object Main {
  val logger: Logger = LogManager.getLogger()

  sys.env.get("ENVIRONMENT") match {
    case Some("dev") | Some("staging") | Some("production") =>
      SentryLog4j2.init()
    case _ => SentryLog4j2.init() // <-- this was only added during debugging
  }

  def main(args: Array[String]): Unit = {
    logger.error("test") // this does not forward the logevent to the appender
  }
}
I think I somehow need to register the appender with the LoggerConfig that the root logger uses, so that all logger.error statements are propagated to the configured Sentry appender? Something like the sketch below, perhaps.
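A minimal sketch of that idea (untested; it reuses the sentryAppender built in init() above, and Configuration.getRootLogger / LoggerConfig.addAppender are plain Log4j2 core API):

val loggerContext: LoggerContext =
  LogManager.getContext(false).asInstanceOf[LoggerContext]
val configuration: Configuration = loggerContext.getConfiguration
// attach the appender to the root LoggerConfig so every logger inherits it
configuration.getRootLogger.addAppender(sentryAppender, Level.ERROR, null)
loggerContext.updateLoggers()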
Greatly appreciate any guidance with this!
Although this is not an answer to how you get Log4j2 and the SentryAppender to work: for anyone else stumbling on this problem, I'll briefly explain what I did to get the Sentry integration working.
What I eventually decided to do was drop the SentryAppender and use the raw Sentry client instead, adding a wrapper class that exposes the typical debug, info, warn and error methods. For the warn-and-above methods, I also send the log event to Sentry.
This is essentially the only way I got this to work within a Flink cluster.
See example below:
sealed trait LoggerLike {
  type LoggerFn = (String, Option[Object]) => Unit
  val debug: LoggerFn
  val info: LoggerFn
  val warn: LoggerFn
  val error: LoggerFn
}

trait LazyLogging {
  @transient
  protected lazy val logger: CustomLogger =
    CustomLogger.getLogger(getClass.getName, enableSentry = true)
}

final class CustomLogger(slf4JLogger: Logger) extends LoggerLike {...your implementation...}
Then each class or object (in Scala, at least) just extends the LazyLogging trait to get a logger instance.
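For illustration, a hypothetical call site (MyJob is made up here, and it assumes CustomLogger implements the LoggerLike members shown above):

object MyJob extends LazyLogging {
  def run(): Unit = {
    logger.info("starting job", None)      // hypothetical: logged normally
    logger.error("something failed", None) // hypothetical: warn+ is also sent to Sentry
  }
}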

Serilog MinimumLevel Override with AspNetCore

Serilog with ASP.NET 5 Razor Pages.
Reducing log verbosity is very useful for informational logs.
However, for debug logs, how do I get a MinimumLevel.Override("Microsoft.AspNetCore") that is specific to a debug file sink?
Creating two configurations could be a solution, but it feels like something more elegant should be possible.
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .Enrich.FromLogContext()
    // for the debug file sink I want the override to be Debug
    .MinimumLevel.Override("Microsoft.AspNetCore", LogEventLevel.Debug)
    .WriteTo.File("debug.txt", restrictedToMinimumLevel: LogEventLevel.Debug)
    // for the info and warning file sinks I want the override to be Warning
    .MinimumLevel.Override("Microsoft.AspNetCore", LogEventLevel.Warning)
    .WriteTo.File("info.txt", restrictedToMinimumLevel: LogEventLevel.Information)
    .WriteTo.File("warning.txt", restrictedToMinimumLevel: LogEventLevel.Warning)
    .CreateLogger();
Everything works as expected using just one override, but not with both together.
In the example above, the Warning override takes precedence and no AspNetCore Debug events are written to debug.txt.
Edit
In summary, I'd like my debug log to include Information-level events from Microsoft.AspNetCore, and my info log file to include Warning-level events from Microsoft.AspNetCore.
I got the two log files the way I wanted by commenting 1. and 2. below in and out:
// 1. for the debug file sink I want an AspNetCore Information or Debug level override
.MinimumLevel.Override("Microsoft.AspNetCore", LogEventLevel.Information)
.WriteTo.File($@"{logFilePath}debugx.txt", restrictedToMinimumLevel: LogEventLevel.Debug, rollingInterval: RollingInterval.Day)
// 2. for the info and warning file sinks below I want only AspNetCore warnings
//.MinimumLevel.Override("Microsoft.AspNetCore", LogEventLevel.Warning)
It's an interesting one: you want to filter log data into different file sinks, for example /Logs/Error/Errlog.txt and /Logs/Info/InfoLog.txt.
You can achieve this with the Serilog.Expressions NuGet package. If time permits, I will paste a working example here.
Serilog.Expressions sample from Serilog: https://github.com/serilog/serilog-expressions/blob/dev/example/Sample/Program.cs
In the example below, the filter excludes the "Welcome!" message with Name = 'User' and only the second line is printed to the console:
using var log = new LoggerConfiguration()
    .Filter.ByExcluding("@m like 'Welcome!%' and Name = 'User'")
    .WriteTo.Console()
    .CreateLogger();

// Excluded by the filter
log.Information("Welcome!, {Name}", "User");

// Logged normally
log.Information("Welcome!, {Name}", "Domain\\UserName");
Here is a filtering example for \Logs\Info\Info-20210720.txt which excludes the Error, Fatal and Warning levels. More information here:
var exprInfo = "@l='Error' or @l='Fatal' or @l='Warning'";

var loggerInfo = new LoggerConfiguration()
    .WriteTo.File(
        @"C:\Temp\Logs\Info\Info-.txt",
        fileSizeLimitBytes: 1_000_000,
        outputTemplate: "{Timestamp:yyyy-MM-dd HH:mm:ss.fff} [{Level}] [{SourceContext}] [{EventId}] {Message}{NewLine}{Exception}",
        rollingInterval: RollingInterval.Day,
        rollOnFileSizeLimit: true,
        shared: true,
        flushToDiskInterval: TimeSpan.FromSeconds(1))
    .MinimumLevel.Override("Microsoft", LogEventLevel.Debug)
    .Filter.ByExcluding(exprInfo)
    .CreateLogger();

try
{
    loggerInfo.Debug("TEST");
    SelfLog.Enable(Console.Out);

    var sw = System.Diagnostics.Stopwatch.StartNew();
    for (var i = 0; i < 100; ++i)
    {
        loggerInfo.Information("Hello, file logger!>>>>>>{Count}", i);
        loggerInfo.Information("Writing to log file with INFORMATION severity level.");
        loggerInfo.Debug("Writing to log file with DEBUG severity level.");
        loggerInfo.Warning("Writing to log file with WARNING severity level.");
        loggerInfo.Error("Writing to log file with ERROR severity level.");
        loggerInfo.Fatal("Writing to log file with CRITICAL severity level.");
    }
    sw.Stop();

    Console.WriteLine($"Elapsed: {sw.ElapsedMilliseconds} ms");
    Console.WriteLine($"Size: {new FileInfo("log.txt").Length}");
    Console.WriteLine("Press any key to delete the temporary log file...");
    Console.ReadKey(true);
}
catch (Exception ex)
{
    loggerInfo.Fatal(ex, "Application Start-up for Serilog failed");
    throw;
}
finally
{
    Log.CloseAndFlush();
}
I solved it by using sub-loggers and filters, as described here: How can I override Serilog levels differently for different sinks?
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .Enrich.FromLogContext()
    // Includes Debug from Microsoft.AspNetCore (noisy)
    // useful for deep debugging
    .WriteTo.File($@"logs/debug.txt", rollingInterval: RollingInterval.Day)
    // Info-with-framework (useful for debugging)
    .WriteTo.Logger(lc => lc
        .MinimumLevel.Information()
        .Filter.ByExcluding("RequestPath in ['/health-check', '/health-check-db']")
        .WriteTo.File("logs/info-with-framework.txt", rollingInterval: RollingInterval.Day)
        .WriteTo.Console()
    )
    // Info
    // framework minimum level is Warning (normal everyday looking at logs)
    .WriteTo.Logger(lc => lc
        .MinimumLevel.Information()
        .Filter.ByExcluding("RequestPath in ['/health-check', '/health-check-db']")
        .Filter.ByExcluding("SourceContext = 'Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware'")
        .Filter.ByExcluding(logEvent =>
            logEvent.Level < LogEventLevel.Warning &&
            Matching.FromSource("Microsoft.AspNetCore").Invoke(logEvent))
        .WriteTo.File("logs/info.txt", rollingInterval: RollingInterval.Day))
    // Warning (bad things - Warnings, Error and Fatal)
    .WriteTo.Logger(lc => lc
        .MinimumLevel.Warning()
        // stopping duplicate stacktraces, see blog 2021/03/10/a11-serilog-logging-in-razor-pages
        .Filter.ByExcluding("SourceContext = 'Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware'")
        .WriteTo.File("logs/warning.txt", rollingInterval: RollingInterval.Day))
    // SignalR - tweak levels by filtering on these namespaces
    // Microsoft.AspNetCore.SignalR
    // Microsoft.AspNetCore.Http.Connections
    .CreateLogger();
Although this works, there may be a better way: https://nblumhardt.com/2016/07/serilog-2-write-to-logger/
I feel like you don't need those minimum level override calls; the restrictedToMinimumLevel parameter on the sinks will take care of the filtering.
You do need to set the minimum level to Information so the info sink can work.

log4j in spark main code

I want to collect logs with log4j. I can get the Hadoop logger output, but not the logger output from my main code.
Attachment 1 - log4j.properties
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.file=/home/spark.log
log4j.appender.rolling.encoding=UTF-8
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=WARN
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=WARN
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=WARN
log4j.logger.com.test.main.Main=WARN
Attachment 2 - com.test.main.Main
import org.apache.log4j.LogManager

object Main {
  def main(args: Array[String]): Unit = {
    val logger = LogManager.getLogger(Main.getClass)
    logger.info("info\n")
    logger.warn("warn\n")
    logger.debug("DEBUG\n")
    logger.error("EEOR\n")
  }
}
Now I can see the Spark logs in /home/spark.log, such as:
[2018-08-14 18:04:28,852] INFO Running Spark version 2.2.0.cloudera1 (org.apache.spark.SparkContext)
[2018-08-14 18:04:29,705] INFO Submitted application: steaming_Test (org.apache.spark.SparkContext)
but there are no log lines from my main code, such as:
18:05:17.250 [main] ERROR com.sgm.bgdt.main.Main$ - EEOR
Is there an error in the setting "log4j.logger.com.test.main.Main=WARN", or is something wrong in my main code?
PS.
These are my spark-submit options:
--driver-java-options "-Dlog4j.configuration=file:/path/log4j.properties"
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/path/log4j.properties"

Exporting functions to multiple callers in different files

I'm building a logging module that can be called by multiple callers located in different files.
My objective is to initialize the log file at the start of the program and have the callers just call a function that logs to the file initialized earlier, without going through the whole initialization again.
I can't quite grasp the concept of module exports, hence I'm hoping you can help.
The actual logging occurs in the write method. In the main app.js file, I can initialize and log just fine.
However, in a different file, I'm having a mental block on how to log to the file without creating the log file again.
var fs = require('fs');

var fd = {},
    log = {},
    debug = false;

var tnlog = function(env, file, hostname, procname, pid) {
  if (env == 'development')
    debug = true;

  fd = fs.createWriteStream(file, { flags: 'a', encoding: 'utf8', mode: 0644 });
  log = { hostname: hostname, procname: procname, pid: pid };
};

tnlog.prototype.write = function(level, str) {
  if (debug)
    console.log(str);
  else {
    log.timestamp = Date.now();
    log.level = level;
    log.str = str;
    fd.write(JSON.stringify(log) + '\n');
  }
};

exports.tnlog = tnlog;
This is how I initialize and log in the main file:
var logfile = '/var/log/node/www/app.log';
var tnlog = require('./lib/tnlog').tnlog,
log = new tnlog(app.get('env'), logfile, os.hostname(), appname, process.pid);
If you can suggest a better way of doing things, I definitely will appreciate that.
Edit
The simplest solution would be to put

var logfile = '/var/log/node/www/app.log';
var tnlog = require('./lib/tnlog').tnlog;

module.exports = new tnlog(app.get('env'), logfile, os.hostname(), appname, process.pid);

into a separate file (mylogger.js), and require that anywhere you want to log something with var logger = require('./mylogger.js'). You always get back that single instance of tnlog, because Node caches the exported value.
I also see you might be using Express, so you could also do
app.set("logger",new tnlog(app.get('env'), logfile, os.hostname(), appname, process.pid))
and retrieve it anywhere you have a reference to the app object with app.get("logger").
Old
More complicated:
You must decide whether you want to support logging to different files while the same app is running. If so, you absolutely need to create an object for each log file. You don't have to export a constructor function per se; you could also export a kind of hybrid between a factory and a singleton pattern.
You would create something like:
var loggers = {};

module.exports = function getLogger(env, file, hostname, procname, pid) {
  var key = env + file + hostname + procname + pid;
  if (loggers[key]) return loggers[key];
  return loggers[key] = new Logger(env, file, hostname, procname, pid);
}
I.e., you check whether you have already created the logger object, keyed on the concatenation of the function arguments.
You then need to create a proper Logger constructor of course, but I assume you know a bit of Javascript.
Note that the loggers object will remain a private variable, just like the Logger constructor. Because node.js caches the object the module exports, the value of the loggers object will persist over multiple calls to require, as part of the getLogger closure.

Configuring log4net appenders via XML file *and* code

I started to play with log4net today and so far, I really like it. In order to preserve our current logging functionality, the app needs to create a new log file whenever the application is started. The log file name has the date and time stamp encoded in it. Currently, I've got log4net configured via an XmlConfigurator, which works great, except that the filename for my RollingFileAppender is hardcoded in the configuration XML file.
I'd like to continue to use the XmlConfigurator, but after calling Configure(), I want to get at the RollingFileAppender and, in code, change its file value to a dynamically generated string. The sample documentation online seems to be down right now, but I've poked through the SDK reference, and it looks like I could use the Hierarchy and GetAppenders() to do what I need. Am I on the right track?
Ok, I took a stab at this and tried the following code, which didn't work:
private static readonly ILog _log = LogManager.GetLogger(typeof(GUI));

// in the config file, I've set the filename to example.log, and it works
XmlConfigurator.Configure(new FileInfo("log_config.xml"));

Hierarchy hierarchy = LogManager.GetRepository() as Hierarchy;
if (hierarchy != null) {
    // get the appenders
    IAppender[] appenders = hierarchy.GetAppenders();

    // change the filename for the RollingFileAppender
    foreach (IAppender a in appenders) {
        RollingFileAppender rfa = a as RollingFileAppender;
        if (rfa == null)
            continue;

        rfa.File = "newfile.log"; // no runtime error, but doesn't work.
    }
}

_log.Info("Application started");
Try this snippet:
XmlConfigurator.Configure();
log4net.Repository.ILoggerRepository repo = LogManager.GetRepository();
foreach (log4net.Appender.IAppender appender in repo.GetAppenders())
{
    if (appender.Name.CompareTo("RollingFileAppender") == 0 && appender is log4net.Appender.RollingFileAppender)
    {
        var appndr = appender as log4net.Appender.RollingFileAppender;
        string logPath = "MyApplication.log";
        appndr.File = logPath;
        // ActivateOptions() must be called for the new file name to take effect
        appndr.ActivateOptions();
    }
}
I posted a similar article here.
Do you actually need the rolling file appender in this case? If not, I would expect your code to produce the desired result with the normal FileAppender.
Edit: Maybe it works with the RollingFileAppender if you call ActivateOptions() on the appender.
