log4j2 TimeBasedTriggeringPolicy isn't logging anything

I'm looking to create a simple TimeBasedTriggeringPolicy that creates a new log file every 5 minutes, but it's not logging anything. I'm not sure what I'm doing wrong. Here are the details, thank you!
My log4j dependencies:
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>2.17.1</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-to-slf4j</artifactId>
    <version>2.17.1</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.17.1</version>
</dependency>
log4j.rootLogger=INFO,rolling
log4j.appender.rolling.type=RollingFile
log4j.appender.rolling.name=fileAppender
log4j.appender.rolling.fileName=${spark.yarn.app.container.log.dir}/spark-${date:yyyyMMdd_HH-mm}.log
log4j.appender.rolling.filePattern=${spark.yarn.app.container.log.dir}/spark-%d{yyyyMMdd_HH-mm}-%i.log
log4j.appender.rolling.layout.type=PatternLayout
log4j.appender.rolling.layout.pattern=%d %p %t %c - %m%n
log4j.appender.rolling.policies.type=Policies
log4j.appender.rolling.policies.time.type=TimeBasedTriggeringPolicy
log4j.appender.rolling.policies.time.interval=5
log4j.appender.rolling.policies.time.modulate=true
log4j.rootLogger.level=info
log4j.rootLogger.appenderRef.rolling.ref=fileLogger
Error Log:
log4j:ERROR Could not find value for key log4j.appender.rolling
log4j:ERROR Could not instantiate appender named "rolling".
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/mnt1/yarn/usercache/hadoop/filecache/45/__spark_libs__6130489434181934811.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Final Working Result (creates a log every minute, and rotates monthly):
log4j.rootLogger=INFO, loggerId
log4j.appender.loggerId=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.loggerId.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.loggerId.rollingPolicy.ActiveFileName=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.loggerId.rollingPolicy.FileNamePattern=${spark.yarn.app.container.log.dir}/spark_%d{dd-HHmm}.log.gz
log4j.appender.loggerId.layout=org.apache.log4j.PatternLayout
log4j.appender.loggerId.layout.ConversionPattern=%d %p %t %c - %m%n
log4j.appender.loggerId.encoding=UTF-8

There are several errors in your configuration (run with -Dlog4j.debug=true to catch them all):
Log4j2 does not have properties prefixed with log4j.; remove the prefix.
There is no property spark.yarn.app.container.log.dir in your configuration. If you meant to use a Java system property, use ${sys:spark.yarn.app.container.log.dir}.
You define the root logger twice:
once with the shorthand notation introduced in version 2.17.2: rootLogger = INFO, rolling,
another time using the full notation:
rootLogger.level = INFO
rootLogger.appenderRef.rolling.ref = fileLogger
Neither definition uses the name you gave to the appender: fileAppender.
You don't have a SizeBasedTriggeringPolicy, but your pattern contains %i. If you remove -%i, your fileName and filePattern are identical: you probably want to use a direct rollover strategy instead and omit fileName.
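Putting those fixes together, a minimal corrected sketch of the properties file (using the full rootLogger notation, which also works on 2.17.1, and assuming spark.yarn.app.container.log.dir is set as a Java system property):
# Log4j2 properties format: no log4j. prefix
appender.rolling.type = RollingFile
appender.rolling.name = fileAppender
# no fileName: a direct rollover strategy writes straight to the pattern
appender.rolling.filePattern = ${sys:spark.yarn.app.container.log.dir}/spark-%d{yyyyMMdd_HH-mm}.log
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d %p %t %c - %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
# interval counts the most specific %d unit (minutes here), so this rolls every 5 minutes
appender.rolling.policies.time.interval = 5
appender.rolling.policies.time.modulate = true
rootLogger.level = INFO
rootLogger.appenderRef.rolling.ref = fileAppender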

This creates a log every minute; we can't do 5-minute increments unless we use XML configuration or customization.
It rotates monthly: I took the yyyyMM out of the pattern so the file names recur monthly; take out dd as well and they'll recur daily:
log4j.rootLogger=INFO, loggerId
log4j.appender.loggerId=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.loggerId.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.loggerId.rollingPolicy.ActiveFileName=${spark.yarn.app.container.log.dir}/spark.log
log4j.appender.loggerId.rollingPolicy.FileNamePattern=${spark.yarn.app.container.log.dir}/spark_%d{dd-HHmm}.log.gz
log4j.appender.loggerId.layout=org.apache.log4j.PatternLayout
log4j.appender.loggerId.layout.ConversionPattern=%d %p %t %c - %m%n
log4j.appender.loggerId.encoding=UTF-8
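Note: the org.apache.log4j.rolling classes used above are not part of the core log4j 1.2 jar; they ship in the Apache Log4j Extras companion, so make sure it is on the classpath. A sketch of the Maven coordinates, assuming the 1.2.17 release:
<dependency>
    <groupId>log4j</groupId>
    <artifactId>apache-log4j-extras</artifactId>
    <version>1.2.17</version>
</dependency>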

Related

Class path contains multiple SLF4J bindings: log4j-slf4j-impl-2.7.jar, slf4j-log4j12-1.7.21.jar

I'm getting the following error. It seems there are multiple logging frameworks bound to slf4j. Not sure how to resolve this. Any help is greatly appreciated.
14:42:35,411 ERROR [stderr] (MSC service thread 1-3) SLF4J: Class path contains multiple SLF4J bindings.
14:42:35,412 ERROR [stderr] (MSC service thread 1-3) SLF4J: Found binding in [vfs:/content/offer-warehouse-processor-api.war/WEB-INF/lib/log4j-slf4j-impl-2.7.jar/org/slf4j/impl/StaticLoggerBinder.class]
14:42:35,412 ERROR [stderr] (MSC service thread 1-3) SLF4J: Found binding in [vfs:/content/offer-warehouse-processor-api.war/WEB-INF/lib/slf4j-log4j12-1.7.21.jar/org/slf4j/impl/StaticLoggerBinder.class]
The message tells you that you have both slf4j-log4j12-1.7.21.jar and log4j-slf4j-impl-2.7.jar on your classpath. slf4j-log4j12 routes all SLF4J logging to log4j 1.2; log4j-slf4j-impl routes all logging to log4j 2. You need to remove the one you don't want. For example, if you want to use log4j 2, then remove slf4j-log4j12-1.7.21.jar from your project. If you aren't sure how it got included and you are using Maven, then run
mvn dependency:tree >mvn.txt
and then look in the mvn.txt file that was created and find where the jar is being included and what dependency it is under from your pom.xml. Then add an exclusion like
<exclusions>
    <exclusion>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
</exclusions>
in the dependency that is including it.
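For example, if mvn.txt showed that slf4j-log4j12 comes in through a (hypothetical) some.group:some-library dependency, the complete entry would look like:
<dependency>
    <groupId>some.group</groupId>
    <artifactId>some-library</artifactId>
    <version>1.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>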

Logback to log4j bridge

My entire system is logged via slf4j with a log4j implementation.
I had a problem where a new module used a logback-classic dependency for logging, which cannot be excluded from the pom.xml file since that breaks it.
First I tried to look for a solution in this third-party dependency but couldn't find one, so I thought maybe a bridge between them could solve it.
What I'm mainly looking for is to split my logs between the default console logging of the dependency and my own log4j.xml loggers and appenders, so I can use them separately.
Is there any bridge so I could use both logback-classic and log4j under slf4j, with an example?
Thanks!
You said:
I had a problem where a new module used a logback-classic dependency for logging, which cannot be excluded from the pom.xml file since that breaks it.
Is this an assumption or did you try it? Also, if you did try it, did you remove the logback-core dependency as well? The only way removing these dependencies would break the module is if the module depends on the logging implementation, since logback natively implements the slf4j API. If the module does depend on the implementation rather than the API/interface, I don't think there's anything you can do without either removing those dependencies (changing the module source code) or writing stubbed versions of the implementation classes that the module depends on.
When I write code that follows the pattern in the logback manual, I'm able to swap the implementation from logback to log4j2 without any issues, as long as I don't introduce dependencies on the logback implementation classes.
Here is the example I wrote:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    private Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) {
        Main main = new Main();
        main.main();
    }

    public void main() {
        log.trace("trace msg");
        log.debug("debug msg");
        log.info("info msg");
        log.warn("warn msg");
        log.error("Error msg");
        log.info(log.getClass().getName());
    }
}
Here are the dependencies in the pom:
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.22</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-core</artifactId>
    <version>1.2.3</version>
</dependency>
<dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
    <version>1.2.3</version>
</dependency>
Here is the output:
00:48:26.378 [main] DEBUG blah.Main - debug msg
00:48:26.380 [main] INFO blah.Main - info msg
00:48:26.380 [main] WARN blah.Main - warn msg
00:48:26.380 [main] ERROR blah.Main - Error msg
00:48:26.380 [main] INFO blah.Main - ch.qos.logback.classic.Logger
Now I change the pom to replace the logback jars with log4j2:
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.22</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-slf4j-impl</artifactId>
    <version>2.7</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-api</artifactId>
    <version>2.7</version>
</dependency>
<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.7</version>
</dependency>
Here's the output after making this change:
2017-09-03 00:52:21,630 INFO b.Main [main] info msg
2017-09-03 00:52:21,631 WARN b.Main [main] warn msg
2017-09-03 00:52:21,631 ERROR b.Main [main] Error msg
2017-09-03 00:52:21,632 INFO b.Main [main] org.apache.logging.slf4j.Log4jLogger
So based on this I think you should, if things are implemented the "right way", be able to swap the logback jars with log4j2 and it should "just work".
You also said:
What I'm mainly looking for is to split my logs between the default console logging of the dependency and my own log4j.xml loggers and appenders, so I can use them separately.
Now, it's not entirely clear to me what you were asking, but I think you wanted log messages from the module to go to the console as well as to whatever appenders you are using in your log4j2 configuration. If this is the case, that's as simple as modifying your log4j2 configuration: add a logger with the appropriate name and assign the appropriate appenders. For example, if your module's classes are com.my.package.Class1, com.my.package.Class2, com.my.package.Class3, etc., then you could create a logger for com.my.package and give it a console appender along with the appropriate file appenders, as sketched below.
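As a sketch (the appender names and file path are illustrative, not taken from your setup), a log4j2.xml along these lines routes the module's logs to both the console and its own file:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration>
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d %p %t %c - %m%n"/>
        </Console>
        <File name="ModuleFile" fileName="logs/module.log">
            <PatternLayout pattern="%d %p %t %c - %m%n"/>
        </File>
    </Appenders>
    <Loggers>
        <!-- additivity="false" keeps these events from also flowing to the root logger -->
        <Logger name="com.my.package" level="info" additivity="false">
            <AppenderRef ref="Console"/>
            <AppenderRef ref="ModuleFile"/>
        </Logger>
        <Root level="info">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>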
Hope this helps!

slf4j logger is not creating log file in my spring batch application

I am using slf4j in my application for logging purposes. But it's not creating a log file in the file system; it's only logging to the console. Below is my log4j.properties file:
# Root logger option
log4j.rootLogger=DEBUG, RollingAppender,stdout
# Redirect log messages to console
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
# Redirect log messages to a log file
log4j.appender.RollingAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.RollingAppender.File=E:\\SalesTerritory.log
log4j.appender.RollingAppender.DatePattern='.'yyyy-MM-dd
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=[%p] %d %c %M - %m%n
And the dependencies I am using in pom.xml for logging are:
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.6</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.5</version>
</dependency>
Use the dependency below for log4j:
<!-- https://mvnrepository.com/artifact/log4j/log4j -->
<dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
</dependency>
Use INFO in log4j.rootLogger instead of DEBUG, because INFO designates informational messages that highlight the progress of the application at a coarse-grained level.

Spark doc build process hangs on Failed to load class "org.slf4j.impl.StaticLoggerBinder"

I followed this
https://github.com/apache/spark/blob/master/docs/README.md
to build the Spark docs, but it hangs on:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
After half an hour, no further info got printed out. I tried to add slf4j-simple-1.7.12.jar into spark/lib_managed/jars, then reran
jekyll build
but it still hangs on these messages. How do I solve the problem?
For me the solution was adding the Logback Classic Module to the classpath:
http://mvnrepository.com/artifact/ch.qos.logback/logback-classic/1.1.7
and removing any other implementations of StaticLoggerBinder.
There's an explanation in the Spark documentation:
http://sparkjava.com/documentation#how-do-i-enable-logging
You might have seen this message when starting Spark:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J:
See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.
To enable logging, just add the following dependency to your project:
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-simple</artifactId>
    <version>1.7.21</version>
</dependency>
For Gradle, check this thread for more details:
How to set SLF4J in IntelliJ with Gradle
Then you might need to configure slf4j properly by creating a config file, but that's a different topic.
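If you go with slf4j-simple, it is configured with a simplelogger.properties file on the classpath; a minimal sketch (property names as documented for SimpleLogger):
# default log level for all SimpleLogger loggers
org.slf4j.simpleLogger.defaultLogLevel=info
# prefix each line with a timestamp
org.slf4j.simpleLogger.showDateTime=true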

How to stop INFO messages displaying on spark console?

I'd like to stop the various messages that appear in the spark shell.
I tried to edit the log4j.properties file in order to stop these messages.
Here are the contents of log4j.properties
# Define the root logger with appender file
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Settings to quiet third party logs that are too verbose
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
But messages are still getting displayed on the console.
Here are some example messages
15/01/05 15:11:45 INFO SparkEnv: Registering BlockManagerMaster
15/01/05 15:11:45 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150105151145-b1ba
15/01/05 15:11:45 INFO MemoryStore: MemoryStore started with capacity 0.0 B.
15/01/05 15:11:45 INFO ConnectionManager: Bound socket to port 44728 with id = ConnectionManagerId(192.168.100.85,44728)
15/01/05 15:11:45 INFO BlockManagerMaster: Trying to register BlockManager
15/01/05 15:11:45 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager 192.168.100.85:44728 with 0.0 B RAM
15/01/05 15:11:45 INFO BlockManagerMaster: Registered BlockManager
15/01/05 15:11:45 INFO HttpServer: Starting HTTP Server
15/01/05 15:11:45 INFO HttpBroadcast: Broadcast server star
How do I stop these?
Edit your conf/log4j.properties file and change the following line:
log4j.rootCategory=INFO, console
to
log4j.rootCategory=ERROR, console
Another approach would be to start spark-shell and type in the following:
import org.apache.log4j.Logger
import org.apache.log4j.Level
Logger.getLogger("org").setLevel(Level.OFF)
Logger.getLogger("akka").setLevel(Level.OFF)
You won't see any logs after that.
Other options for Level include: all, debug, error, fatal, info, off, trace, trace_int, warn
Details about each can be found in the documentation.
Right after starting spark-shell, type:
sc.setLogLevel("ERROR")
You could put this in a preload file and use it like:
spark-shell ... -I preload-file ...
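For example, a minimal sketch (the file name preload.scala is hypothetical):
// preload.scala -- run by the shell before the prompt appears
sc.setLogLevel("ERROR")
Then start the shell with:
spark-shell -I preload.scala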
In Spark 2.0 (Scala):
val spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
API Docs : https://spark.apache.org/docs/2.2.0/api/scala/index.html#org.apache.spark.sql.SparkSession
For Java:
SparkSession spark = SparkSession.builder().getOrCreate();
spark.sparkContext().setLogLevel("ERROR");
All the methods collected with examples
Intro
Actually, there are many ways to do it.
Some are harder than others, but it is up to you which one suits you best. I will try to showcase them all.
#1 Programmatically in your app
Seems to be the easiest, but you will need to recompile your app to change those settings. Personally, I don't like it but it works fine.
Example:
import org.apache.log4j.{Level, Logger}
val rootLogger = Logger.getRootLogger()
rootLogger.setLevel(Level.ERROR)
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
Logger.getLogger("org.spark-project").setLevel(Level.WARN)
You can achieve much more just using log4j API.
Source: Log4J Configuration Docs, Configuration section
#2 Pass log4j.properties during spark-submit
This one is very tricky, but not impossible. And my favorite.
During app startup, Log4j always looks for and loads a log4j.properties file from the classpath.
However, when using spark-submit, the Spark cluster's classpath has precedence over the app's classpath! This is why putting this file in your fat-jar will not override the cluster's settings!
Add -Dlog4j.configuration=<location of configuration file> to
spark.driver.extraJavaOptions (for the driver) or
spark.executor.extraJavaOptions (for executors).
Note that if using a file, the file: protocol should be explicitly provided, and the file needs to exist locally on all the nodes.
To satisfy the last condition, you can either upload the file to a location available to the nodes (like HDFS) or access it locally with the driver if using deploy-mode client. Otherwise:
upload a custom log4j.properties using spark-submit, by adding it to
the --files list of files to be uploaded with the application.
Source: Spark docs, Debugging
Steps:
Example log4j.properties:
# Blacklist all to warn level
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
# Whitelist our app to info :)
log4j.logger.com.github.atais=INFO
Executing spark-submit, for cluster mode:
spark-submit \
--master yarn \
--deploy-mode cluster \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
--files "/absolute/path/to/your/log4j.properties" \
--class com.github.atais.Main \
"SparkApp.jar"
Note that you must use --driver-java-options if using client mode. Spark docs, Runtime env
Executing spark-submit, for client mode:
spark-submit \
--master yarn \
--deploy-mode client \
--driver-java-options "-Dlog4j.configuration=file:/absolute/path/to/your/log4j.properties" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties" \
--files "/absolute/path/to/your/log4j.properties" \
--class com.github.atais.Main \
"SparkApp.jar"
Notes:
Files uploaded to spark-cluster with --files will be available at root dir, so there is no need to add any path in file:log4j.properties.
Files listed in --files must be provided with absolute path!
file: prefix in configuration URI is mandatory.
#3 Edit cluster's conf/log4j.properties
This changes global logging configuration file.
update the $SPARK_CONF_DIR/log4j.properties file and it will be
automatically uploaded along with the other configurations.
Source: Spark docs, Debugging
To find your SPARK_CONF_DIR you can use spark-shell:
atais#cluster:~$ spark-shell
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.1.1
/_/
scala> System.getenv("SPARK_CONF_DIR")
res0: String = /var/lib/spark/latest/conf
Now just edit /var/lib/spark/latest/conf/log4j.properties (with example from method #2) and all your apps will share this configuration.
#4 Override configuration directory
If you like solution #3 but want to customize it per application, you can actually copy the conf folder, edit its contents, and specify it as the root configuration during spark-submit.
To specify a different configuration directory other than the default “SPARK_HOME/conf”, you can set SPARK_CONF_DIR. Spark will use the configuration files (spark-defaults.conf, spark-env.sh, log4j.properties, etc) from this directory.
Source: Spark docs, Configuration
Steps:
Copy cluster's conf folder (more info, method #3)
Edit log4j.properties in that folder (example in method #2)
Set SPARK_CONF_DIR to this folder, before executing spark-submit,
example:
export SPARK_CONF_DIR=/absolute/path/to/custom/conf
spark-submit \
--master yarn \
--deploy-mode cluster \
--class com.github.atais.Main \
"SparkApp.jar"
Conclusion
I am not sure if there is any other method, but I hope this covers the topic from A to Z. If not, feel free to ping me in the comments!
Enjoy your way!
Thanks @AkhlD and @Sachin Janani for suggesting changes in the .conf file.
Following code solved my issue:
1) Added import org.apache.log4j.{Level, Logger} in the import section
2) Added the following line after creation of the spark context object, i.e. after val sc = new SparkContext(conf):
val rootLogger = Logger.getRootLogger()
rootLogger.setLevel(Level.ERROR)
Use the command below to change the log level while submitting an application using spark-submit or spark-sql:
spark-submit \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:<file path>/log4j.xml"
Note: replace <file path> with the path where the log4j config file is stored.
Log4j.properties:
log4j.rootLogger=ERROR, console
# set the log level for these components
log4j.logger.com.test=DEBUG
log4j.logger.org=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.org.spark-project=ERROR
log4j.logger.org.apache.hadoop=ERROR
log4j.logger.io.netty=ERROR
log4j.logger.org.apache.zookeeper=ERROR
# add a ConsoleAppender to the logger stdout to write to the console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
# use a simple message format
log4j.appender.console.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
log4j.xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
    <appender name="console" class="org.apache.log4j.ConsoleAppender">
        <param name="Target" value="System.out"/>
        <layout class="org.apache.log4j.PatternLayout">
            <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n"/>
        </layout>
    </appender>
    <logger name="org.apache.spark">
        <level value="error"/>
    </logger>
    <logger name="org.spark-project">
        <level value="error"/>
    </logger>
    <logger name="org.apache.hadoop">
        <level value="error"/>
    </logger>
    <logger name="io.netty">
        <level value="error"/>
    </logger>
    <logger name="org.apache.zookeeper">
        <level value="error"/>
    </logger>
    <logger name="org">
        <level value="error"/>
    </logger>
    <root>
        <priority value="ERROR"/>
        <appender-ref ref="console"/>
    </root>
</log4j:configuration>
Switch to a FileAppender in log4j.xml if you want to write logs to a file instead of the console. LOG_DIR is a variable for the logs directory, which you can supply as a system property, e.g. spark-submit --conf "spark.driver.extraJavaOptions=-DLOG_DIR=<path>".
<appender name="file" class="org.apache.log4j.DailyRollingFileAppender">
<param name="file" value="${LOG_DIR}"/>
<param name="datePattern" value="'.'yyyy-MM-dd"/>
<layout class="org.apache.log4j.PatternLayout">
<param name="ConversionPattern" value="%d [%t] %-5p %c %x - %m%n"/>
</layout>
</appender>
Another important thing to understand here is that when a job is launched in distributed mode (deploy-mode cluster and master yarn or mesos), the log4j configuration file should exist on the driver and worker nodes (log4j.configuration=file:<file path>/log4j.xml), or else log4j init will complain:
log4j:ERROR Could not read configuration file [log4j.properties].
java.io.FileNotFoundException: log4j.properties (No such file or
directory)
Hints on solving this problem:
Keep the log4j config file in a distributed file system (HDFS or Mesos) and add external configuration using the log4j PropertyConfigurator,
or use sparkContext.addFile to make it available on each node, then use the log4j PropertyConfigurator to reload the configuration, as sketched below.
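A rough sketch of the second hint, shown for the driver side (the HDFS path is a placeholder; PropertyConfigurator is the log4j 1.x API, and executors would need the same configure call inside their tasks):
import org.apache.log4j.PropertyConfigurator
import org.apache.spark.SparkFiles

// ship the config file to every node alongside the job
sc.addFile("hdfs:///path/to/log4j.properties")
// reload log4j from the local copy that addFile created
PropertyConfigurator.configure(SparkFiles.get("log4j.properties"))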
You can disable the logs by setting their level to OFF as follows:
Logger.getLogger("org").setLevel(Level.OFF);
Logger.getLogger("akka").setLevel(Level.OFF);
or edit the log4j config file and set the log level to OFF by just changing the following property:
log4j.rootCategory=OFF, console
I just add this line to all my pyspark scripts at the top, just below the import statements.
SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
Example header of my pyspark scripts:
from pyspark.sql import SparkSession, functions as fs
SparkSession.builder.getOrCreate().sparkContext.setLogLevel("ERROR")
The answers above are correct but didn't exactly help me, as there was additional information I required.
I had just set up Spark, so the log4j file still had the '.template' suffix and wasn't being read. I believe that logging then defaults to the Spark core logging conf.
So if you are like me and find that the answers above didn't help, then maybe you too have to remove the '.template' suffix from your log4j conf file, and then the above works perfectly!
http://apache-spark-user-list.1001560.n3.nabble.com/disable-log4j-for-spark-shell-td11278.html
In Python/Spark we can do:
def quiet_logs(sc):
    logger = sc._jvm.org.apache.log4j
    logger.LogManager.getLogger("org").setLevel(logger.Level.ERROR)
    logger.LogManager.getLogger("akka").setLevel(logger.Level.ERROR)
Then, after defining SparkContext 'sc', call this function with: quiet_logs(sc)
tl;dr
For Spark Context you may use:
sc.setLogLevel(<logLevel>)
where loglevel can be ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE or WARN.
Details:
Internally, setLogLevel calls org.apache.log4j.Level.toLevel(logLevel), which it then uses to set the root logger level via org.apache.log4j.LogManager.getRootLogger().setLevel(level).
You may directly set the logging levels to OFF using:
LogManager.getLogger("org").setLevel(Level.OFF)
You can set up the default logging for Spark shell in conf/log4j.properties. Use conf/log4j.properties.template as a starting point.
Setting Log Levels in Spark Applications
In standalone Spark applications or while in Spark Shell session, use the following:
import org.apache.log4j.{Level, Logger}
Logger.getLogger(classOf[RackResolver]).getLevel
Logger.getLogger("org").setLevel(Level.OFF)
Logger.getLogger("akka").setLevel(Level.OFF)
Disabling logging (in log4j):
Use the following in conf/log4j.properties to disable logging completely:
log4j.logger.org=OFF
Reference: Mastering Spark by Jacek Laskowski.
Simply add the param below to your spark-shell or spark-submit command:
--conf "spark.driver.extraJavaOptions=-Dlog4jspark.root.logger=WARN,console"
Check the exact property name (log4jspark.root.logger here) in your log4j.properties file.
Hope this helps, cheers!
Adding the following to the PySpark did the job for me:
self.spark.sparkContext.setLogLevel("ERROR")
self.spark is the spark session (self.spark = spark_builder.getOrCreate())
Simple to do on the command line...
spark2-submit --driver-java-options="-Droot.logger=ERROR,console" ..other options..
An interesting idea is to use the RollingAppender as suggested here: http://shzhangji.com/blog/2015/05/31/spark-streaming-logging-configuration/
so that you don't "pollute" the console space, but are still able to see the results under $YOUR_LOG_PATH_HERE/${dm.logging.name}.log.
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.file=$YOUR_LOG_PATH_HERE/${dm.logging.name}.log
log4j.appender.rolling.encoding=UTF-8
Another method that addresses the cause is to observe what kinds of logging you usually have (coming from different modules and dependencies), set the granularity for each of them, and quiet third-party logs that are too verbose:
For instance,
# Silence akka remoting
log4j.logger.Remoting=ERROR
log4j.logger.akka.event.slf4j=ERROR
log4j.logger.org.spark-project.jetty.server=ERROR
log4j.logger.org.apache.spark=ERROR
log4j.logger.com.anjuke.dm=${dm.logging.level}
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO
If you don't have the ability to edit the Java code to insert the .setLogLevel() statements, and you don't want yet more external files to deploy, you can use a brute-force way to solve this: just filter out the INFO lines using grep.
spark-submit --deploy-mode client --master local <rest-of-cmd> | grep -v -F "INFO"
Adjust conf/log4j.properties as described by others:
log4j.rootCategory=ERROR, console
Make sure that while executing your spark job you pass the --files flag with the log4j.properties file path.
If it still doesn't work, you might have a jar containing a log4j.properties that is being loaded before your new log4j.properties. Remove that log4j.properties from the jar (if appropriate); a quick check is sketched below.
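To see whether a jar bundles its own log4j.properties, a quick sketch using the JDK's jar tool (the jar name is a placeholder):
jar tf SparkApp.jar | grep log4j.properties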
sparkContext.setLogLevel("OFF")
In addition to all the above posts, here is what solved the issue for me.
Spark uses slf4j to bind to loggers. If log4j is not the first binding found, you can edit log4j.properties files all you want; the loggers are not even used. For example, this could be a possible SLF4J output:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-simple/1.6.6/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/~/.m2/repository/org/slf4j/slf4j-log4j12/1.7.19/slf4j-log4j12-1.7.19.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
So here the SimpleLoggerFactory was used, which does not care about log4j settings.
Excluding the slf4j-simple package from my project via
<dependency>
    ...
    <exclusions>
        ...
        <exclusion>
            <artifactId>slf4j-simple</artifactId>
            <groupId>org.slf4j</groupId>
        </exclusion>
    </exclusions>
</dependency>
resolved the issue, as now the log4j logger binding is used and any setting in log4j.properties is adhered to.
F.Y.I. my log4j properties file contains (besides the normal configuration)
log4j.rootLogger=WARN, stdout
...
log4j.category.org.apache.spark = WARN
log4j.category.org.apache.parquet.hadoop.ParquetRecordReader = FATAL
log4j.additivity.org.apache.parquet.hadoop.ParquetRecordReader=false
log4j.logger.org.apache.parquet.hadoop.ParquetRecordReader=OFF
Hope this helps!
This one worked for me.
For only ERROR messages to be displayed on stdout, the log4j.properties file may look like:
# Root logger option
log4j.rootLogger=ERROR, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
NOTE: Put the log4j.properties file in the src/main/resources folder for it to take effect.
And if log4j.properties doesn't exist (meaning spark is using the log4j-defaults.properties file), then you can create it by going to SPARK_HOME/conf, running mv log4j.properties.template log4j.properties, and then proceeding with the changes described above.
If anyone else is stuck on this:
none of the above worked for me.
I had to remove
implementation group: "ch.qos.logback", name: "logback-classic", version: "1.2.3"
implementation group: 'com.typesafe.scala-logging', name: "scala-logging_$scalaVersion", version: '3.9.2'
from my build.gradle for the logs to disappear. TL;DR: Don't import any other logging frameworks; you should be fine just using org.apache.log4j.Logger.
Another way of stopping logs completely is:
import org.apache.log4j.Appender;
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.varia.NullAppender;
public class SomeClass {
    public static void main(String[] args) {
        Appender nullAppender = new NullAppender();
        BasicConfigurator.configure(nullAppender);
        {...more code here...}
    }
}
This worked for me.
A NullAppender is "An Appender that ignores log events." (https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/apache/logging/log4j/core/appender/NullAppender.html)
