Spark 2.3 - Log4j Vulnerability - apache-spark

Our project runs Spark 2.3 on 7 nodes.
Recently, a security scan by our security team reported a Log4j vulnerability.
We can see a Log4j 1.x jar in the Spark folder (/opt/spark/jars/log4j-1.2.17.jar).
We tried replacing that jar with Log4j 2.17.1 and running Spark, but Spark fails with "NoClassDefFoundError" for the class org/apache/log4j/or/RendererMap.
Please help me resolve this issue.

Try using log4j-1.2-api, version 2.17.1:
https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-1.2-api

You need to copy three jars (core, api, and the 1.2 bridge) from https://archive.apache.org/dist/logging/log4j/ into the spark/jars folder.
See this page for details:
https://logging.apache.org/log4j/2.x/manual/migration.html
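Before restarting Spark, it can help to check which Log4j jars are actually present. The following is a minimal Python sketch, not part of the original answers: it lists any Log4j 1.x jars still in a jars directory and reports which of the three Log4j 2 jars named above (core, api, bridge) are missing. The function name and the file-name patterns are assumptions for illustration.

```python
import re
from pathlib import Path

# Log4j 2 artifacts named in the answers above: core, api, and the 1.2 bridge.
REQUIRED_PREFIXES = ("log4j-core-", "log4j-api-", "log4j-1.2-api-")

# Matches Log4j 1.x jars such as log4j-1.2.17.jar (but not log4j-1.2-api-*.jar).
LEGACY_PATTERN = re.compile(r"^log4j-1\.\d+\.\d+\.jar$")


def check_log4j_jars(jars_dir):
    """Return (legacy_jars, missing_prefixes) for a Spark jars directory."""
    names = sorted(p.name for p in Path(jars_dir).glob("*.jar"))
    # Log4j 1.x jars that are still on the classpath.
    legacy = [n for n in names if LEGACY_PATTERN.match(n)]
    # Required Log4j 2 jars for which no file with that prefix exists.
    missing = [
        prefix
        for prefix in REQUIRED_PREFIXES
        if not any(n.startswith(prefix) for n in names)
    ]
    return legacy, missing
```

For the setup in the question, it would be called as check_log4j_jars("/opt/spark/jars"). An empty first list and an empty second list would mean the old jar is gone and all three replacement jars are in place.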

Related

How to solve vulnerability issues based on Log4j?

A vulnerability assessment reported multiple issues related to Log4j, listed below:
Apache Log4j Unsupported Version Detection
Apache Log4j 1.x Multiple Vulnerabilities
Apache Log4j 1.2 JMSAppender Remote Code Execution (CVE-2021-4104)
The assessment suggests upgrading the Log4j version. The jar comes with the PySpark package, so I uninstalled that package since I am not using it, but the same issues were reported again.
How can I solve this? Can anyone suggest a solution?

How to Upgrade to log4j 2.x without changing any import statements

What is the best way to upgrade from Log4j 1.x to 2.x?
I have an Ant project. I just deleted the old jar and replaced it with the new Log4j jar. Is that enough?
Log4j 2 does not use the same configuration file format. However, Log4j 2.13.0 introduced experimental support for some Log4j 1 configuration files. So you have two choices:
1. Follow the steps outlined at "Migrating from Log4j 1.x", which involve including the log4j-1.2-api jar and converting your configuration files to Log4j 2 format, or
2. Include the log4j-1.2-api jar but, instead of converting your configuration files, follow the steps at "Log4j 2 Compatibility with Log4j 1".
Note that since option 2 is experimental you may have configurations that Log4j 2 cannot handle. If that happens the Log4j 2 team welcomes you to report a Jira issue to determine how the support can be improved.
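For the second choice, the Log4j 2 documentation describes turning on Log4j 1 configuration support through JVM system properties. Below is a minimal Python sketch that builds those flags for a launcher script. It assumes the property names log4j1.compatibility and log4j.configuration from that documentation; the function name is hypothetical and not part of the original answer.

```python
def log4j1_compat_jvm_flags(config_path=None):
    """Build JVM flags that ask Log4j 2 to read a Log4j 1 configuration.

    Without a path, Log4j 2 is asked to search the classpath for
    log4j.properties or log4j.xml. With a path, that file is named
    explicitly (assumption: only .properties and .xml are supported).
    """
    if config_path is None:
        return ["-Dlog4j1.compatibility=true"]
    if not config_path.endswith((".properties", ".xml")):
        raise ValueError("expected a .properties or .xml Log4j 1 configuration")
    return ["-Dlog4j.configuration=" + config_path]
```

The returned list can be appended to the java command line of the application (for an Ant build, for example as jvmarg values of the java task). Since this support is experimental, a configuration that works under Log4j 1 may still need manual conversion.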

What is LongAdder related to cassandra+spark connector?

When I load data into Cassandra using Databricks, I get this error:
Caused by: java.lang.NoClassDefFoundError: com/twitter/jsr166e/LongAdder
It is a simple saveToCassandra call to a table.
I looked up this Twitter jsr166e jar in Maven; it is very old, added in 2013.
I don't know why this jar is not available in the Spark Cassandra connector.
That error indicates that you are missing dependencies and/or that the Spark Cassandra connector is not on the runtime classpath of the Spark application. I'm not sure how you installed the connector, but you should use the packages method so that its dependencies are resolved and the connector is correctly configured.
Hope that helps,
Pat
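The packages method mentioned above means passing the connector's Maven coordinates to spark-submit with --packages, so that Spark resolves the connector together with its transitive dependencies. Below is a minimal Python sketch that assembles such a command. The function name is hypothetical, and the coordinates in the usage note are an assumed example that must be matched to your own Spark and Scala versions.

```python
def spark_submit_command(app, packages, extra_conf=None):
    """Build a spark-submit argument list that pulls dependencies via --packages."""
    # --packages takes a comma-separated list of Maven coordinates.
    cmd = ["spark-submit", "--packages", ",".join(packages)]
    # Each extra setting is passed as its own --conf key=value pair.
    for key, value in sorted((extra_conf or {}).items()):
        cmd += ["--conf", "{}={}".format(key, value)]
    cmd.append(app)
    return cmd
```

A call might look like spark_submit_command("job.py", ["com.datastax.spark:spark-cassandra-connector_2.11:2.3.0"], {"spark.cassandra.connection.host": "127.0.0.1"}), with the result passed to subprocess.run. On a managed platform such as Databricks, the equivalent is usually to attach the connector to the cluster as a Maven library instead of uploading a single jar.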

Spark does not resolve Ivy-specified repositories after upgrade from 2.2.1 to 2.3

We have a Spark configuration that uses spark.jars.ivySettings to customize jar resolution.
Our Spark jobs run in an environment without internet access, so we want to skip calls to Maven Central and use our own repositories.
In Spark 2.2.1 everything worked fine, but after we upgraded to 2.3, the repositories specified in the Ivy settings are ignored. As a result, our jobs fail due to missing dependencies.
Specifying our repositories with the new spark.jars.repositories makes them visible to Spark, but does not change the resolution order (so it always checks Maven Central first, which we cannot allow).
Is this a bug introduced in the new version, or am I doing something wrong?
OK, I found the problem. Apparently the way spark.jars.ivySettings is read changed in 2.3. It is now taken from the system properties:
sys.props.get("spark.jars.ivySettings")
This change was not accompanied by a documentation update, and to me it looks like a bug.
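If the value is indeed read from JVM system properties, one possible workaround is to set it as a -D flag on the JVM that runs spark-submit. Below is a minimal Python sketch of that idea. It rests on the untested assumption that the spark-submit launcher forwards the SPARK_SUBMIT_OPTS environment variable to that JVM; the function name and the settings path in the usage note are hypothetical.

```python
import os


def submit_env_with_ivy_settings(ivy_settings_path, base_env=None):
    """Return an environment dict that sets spark.jars.ivySettings as a JVM system property."""
    # Copy so the caller's environment (or os.environ) is not modified.
    env = dict(os.environ if base_env is None else base_env)
    flag = "-Dspark.jars.ivySettings=" + ivy_settings_path
    # Append to any JVM options already present in SPARK_SUBMIT_OPTS.
    existing = env.get("SPARK_SUBMIT_OPTS", "")
    env["SPARK_SUBMIT_OPTS"] = (existing + " " + flag).strip()
    return env
```

The result would be passed as the env argument of subprocess.run when launching spark-submit, for example with submit_env_with_ivy_settings("/etc/spark/ivysettings.xml"). Whether this restores the 2.2.1 behavior should be verified against the Spark version in use.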

What is the difference between log4j-1.2-api and log4j-api (without the 1.2 suffix)?

The Maven repository for Log4j lists both artifacts.
Could somebody explain the difference between these two APIs? I assume the one without the 1.2 suffix is the latest.
log4j-1.2-api is a bridge API that lets applications written against Log4j 1.2.x run on Log4j 2.
log4j-1.2-api is a bridge that lets Log4j 1 code log to a Log4j 2 log file. log4j-api is the Log4j 2 API that you call directly in your own code to log.
