Is there a reference of Spark Log4j properties? - apache-spark

I've been trying to find a reference of all the log4j properties for Spark and am having a hard time finding one. I've found a lot of examples where people seem to have pieces of it, but I'm trying to see if there's a reference somewhere that has all of them.
For my particular use case, I'm writing some code that performs a series of data transformations by firing off a spark-submit job, which can then be used/extended by other users. I don't need most of what Spark spits out by default, and it's easy to just set something like log4j.rootLogger=WARN,stdout. However, there are some useful bits in INFO that would be good to have printed to the screen. In particular:
org.apache.spark.deploy.yarn.Client (Logging.scala:logInfo(54)) -
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: ****
start time: 1508185983070
final status: UNDEFINED
tracking URL: ***My tracking URL***
user: ***User***
More specifically, I want the tracking URL. My limited knowledge of Log4j probably makes this tougher than it should be. I've tried doing something like:
But that doesn't appear to be a legit logging property. Is there a way to only get that piece of info in Spark? Is there a trick to seeing all the possible logging properties to set?
I was able to figure this out. Most of the trouble was due to me not knowing how Log4j works, but I have a much better handle on it now.
You can set the logger and log level per class, and that setting persists down to all child loggers.
I changed my log4j.properties to look something like this, with each logger entry ending in ", RollingAppender" so that it writes to that appender.
And that redirects pretty much all Spark on YARN logs to a file (slightly modified from the link Thiago shared).
The key things I was missing...
1) I needed to include log4j.logger.CLASS_NAME; I was missing the log4j.logger prefix.
2) I needed log4j.additivity.CLASS_NAME=false. Without this, the INFO messages are also logged to the default (root logger) appender.
It's pretty confusing at first but starts to make a bit of sense once you get the pattern down.
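The pattern described above can be sketched as a small log4j.properties file. This is only a minimal sketch, assuming Log4j 1.x properties syntax (what Spark releases of that era shipped with). The class name and the appender name RollingAppender come from this thread; the file path, size limits, and conversion patterns are hypothetical placeholders.

```properties
# Everything not configured below stays at WARN on the console.
log4j.rootLogger=WARN, stdout

log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.target=System.err
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# File appender. The name "RollingAppender" is from the answer;
# the path and limits are hypothetical.
log4j.appender.RollingAppender=org.apache.log4j.RollingFileAppender
log4j.appender.RollingAppender.File=/tmp/spark-job.log
log4j.appender.RollingAppender.MaxFileSize=10MB
log4j.appender.RollingAppender.MaxBackupIndex=5
log4j.appender.RollingAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.RollingAppender.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n

# Key thing 1: the property name is log4j.logger.CLASS_NAME,
# and the value is LEVEL, APPENDER_NAME.
log4j.logger.org.apache.spark.deploy.yarn.Client=INFO, RollingAppender

# Key thing 2: stop these events from also reaching the root logger's appender.
log4j.additivity.org.apache.spark.deploy.yarn.Client=false
```

If the goal is instead to see the tracking URL on screen, as in the question, it should be enough under the same assumptions to keep only log4j.logger.org.apache.spark.deploy.yarn.Client=INFO, with no appender name and no additivity line, so that the INFO messages from that class flow up to the root logger's stdout appender.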

I suggest you take a look at this article on Hacker Noon:
Generating your own logs is a little more complex in Spark when the application runs on YARN via spark-submit.


