How to rebuild Apache Livy with Scala 2.12

I'm using Spark 3.1.1, which uses Scala 2.12, but the pre-built Livy download uses Scala 2.11 (after unzipping, you can find a folder named repl_2.11-jars/).
According to a comment by Aliaksandr Sasnouskikh, Livy needs to be rebuilt, or it will throw the error {'msg': 'requirement failed: Cannot find Livy REPL jars.'} even on a POST to create a session.
The documentation mentions:
By default Livy is built against Apache Spark 2.4.5
If I'd like to rebuild Livy, how can I change the Spark version it is built against?
Thanks in advance.

You can rebuild Livy by passing the spark-3.0 profile to Maven to create a custom build for Spark 3, for example:
git clone <incubator-livy-repository-url> && \
cd incubator-livy && \
mvn clean package -B -V -e \
-Pspark-3.0 \
-Pthriftserver \
-DskipTests \
-DskipITs
This profile is defined in pom.xml; by default it builds against Spark 3.0.0. You can change it to use a different Spark version.
As far as I know, Livy supports Spark 3.0.x, but it is worth testing with 3.1.1. Let us know how it goes :)

I tried to build Livy for Spark 3.1.1 based on rmakoto's answer and it worked! I tinkered a lot and can't remember exactly what I edited in pom.xml, so I am just going to attach my gist link here.
I also had to edit the python-api/pom.xml file to build with Python 3, because the default pom.xml produced syntax errors during the build. Here's the pom.xml gist for python-api.
After that, just build with:
mvn clean package -B -V -e \
-Pspark-3.0 \
-Pthriftserver \
-DskipTests \
-DskipITs
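To confirm which Scala version a Livy build targets, one option is to look at the repl_<scala-version>-jars folder names mentioned in the question. The sketch below assumes the unpacked distribution keeps that naming; livy_scala_versions and the path in the usage line are hypothetical names, not part of Livy.

```shell
# Print the Scala version(s) a Livy distribution ships REPL jars for,
# inferred from its repl_<scala-version>-jars directory names.
# Assumption: the directory naming matches what the question describes.
livy_scala_versions() {
  livy_home="$1"
  for dir in "$livy_home"/repl_*-jars; do
    # Skip the unexpanded pattern when nothing matches.
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    name=${name#repl_}              # strip the "repl_" prefix
    printf '%s\n' "${name%-jars}"   # strip the "-jars" suffix
  done
}

# Usage (hypothetical path):
# livy_scala_versions /path/to/unzipped-livy
```

If the function prints nothing, no REPL jar folder was found at all, which would match the "Cannot find Livy REPL jars" error from the question.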

Based on @gamberooni's changes (but using Spark 3.1.2 instead of 3.1.1 and Hadoop 3.2.0 instead of 3.2.1), this is the diff:
diff --git a/pom.xml b/pom.xml
index d2e535a..5c28ee6 100644
--- a/pom.xml
+++ b/pom.xml
@@ -79,12 +79,12 @@
- <hadoop.version>2.7.3</hadoop.version>
+ <hadoop.version>3.2.0</hadoop.version>
- <spark.scala-2.12.version>2.4.5</spark.scala-2.12.version>
- <spark.version>${spark.scala-2.11.version}</spark.version>
- <hive.version>3.0.0</hive.version>
+ <spark.scala-2.12.version>3.1.2</spark.scala-2.12.version>
+ <spark.version>${spark.scala-2.12.version}</spark.version>
+ <hive.version>3.1.2</hive.version>
@@ -1060,7 +1060,7 @@
- <spark.scala-2.12.version>3.0.0</spark.scala-2.12.version>
+ <spark.scala-2.12.version>3.1.2</spark.scala-2.12.version>
@@ -1072,9 +1072,9 @@
- <>spark-3.0.0-bin-hadoop2.7</>
+ <>spark-3.1.2-bin-hadoop3.2</>
diff --git a/python-api/pom.xml b/python-api/pom.xml
index 8e5cdab..a8fb042 100644
--- a/python-api/pom.xml
+++ b/python-api/pom.xml
@@ -46,7 +46,7 @@
- <executable>python</executable>
+ <executable>python3</executable>
@@ -60,7 +60,7 @@
- <executable>python</executable>
+ <executable>python3</executable>

My Spark version is 3.2.1 and my Scala version is 2.12.15. I built Livy successfully and put it into use, so here is my build process. Pull the Livy master branch and modify the pom file as follows:
Finally, execute the package command
mvn clean package -B -V -e -Pspark-3.0 -Pthriftserver -DskipTests -DskipITs -Dmaven.javadoc.skip=true
After Livy's deployment:


PIT-Cucumber plugin not finding scenarios in feature files

I'm trying to introduce PIT mutation testing in an enterprise project. I got it to run the existing JUnit tests, but we also have a lot of Cucumber tests that need to be part of the metric. I added the pit-cucumber plugin to the Maven project, but the output says no scenarios were found. I'm not sure whether there is some secret in the plugin configuration that I can't see.
I get this output:
INFO : Sending 0 test classes to minion
Make sure you're using Cucumber version 4.20 jars with pitest-cucumber-plugin 0.8
Everything else looks good. You may not need to specify targetClasses and targetTests.

Using log4j2 in a Spark Java application

I'm trying to use the log4j2 logger in my Spark job. Essential requirement: the log4j2 config is located outside the classpath, so I need to specify its location explicitly. When I run my code directly within the IDE without using spark-submit, log4j2 works well. However, when I submit the same code to a Spark cluster using spark-submit, it fails to find the log4j2 configuration and falls back to the default old log4j.
Launcher command
${SPARK_HOME}/bin/spark-submit \
--class \
--verbose \
--master 'local[*]' \
--files "log4j2.xml" \
--conf spark.executor.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
--conf spark.driver.extraJavaOptions="-Dlog4j.configurationFile=log4j2.xml" \
Log4j2 dependencies in maven
. . .
<!-- Bridge log4j to log4j2 -->
<!-- Bridge slf4j to log4j2 -->
Any ideas what I might be missing?
Apparently there is currently no official support for log4j2 in Spark. Here is a detailed discussion on the subject:
On the practical side, that means:
If you have access to the Spark configs and jars and can modify them, you can still use log4j2 by manually adding the log4j2 jars to SPARK_CLASSPATH and providing a log4j2 configuration file to Spark.
If you run on a managed Spark cluster and have no access to the Spark jars/configs, you can still use log4j2, but only in code executed on the driver side. Any code run by the executors will use the Spark executors' logger (which is the old log4j).
Spark falls back to log4j probably because it cannot initialize the logging system during startup (your application code has not been added to the classpath at that point).
If you are permitted to place new files on your cluster nodes, create a directory on all of them (for example /opt/spark_extras), put all the log4j2 jars there, and add two configuration options to spark-submit:
--conf spark.executor.extraClassPath=/opt/spark_extras/*
--conf spark.driver.extraClassPath=/opt/spark_extras/*
Then the libraries will be added to the classpath.
If you cannot modify files on the cluster, you can try another approach: add all the log4j2 jars to the spark-submit parameters using --jars. According to the documentation, all these libraries will be added to the driver's and executors' classpaths, so it should work the same way.
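Since --jars expects a single comma-separated list, typing out every log4j2 jar by hand gets tedious. A minimal sketch of building that list from a local directory follows; jars_csv and the directory in the usage line are hypothetical names chosen for illustration.

```shell
# Join every .jar file in a directory into the comma-separated
# form that spark-submit's --jars option expects.
jars_csv() {
  jar_dir="$1"
  csv=""
  for jar in "$jar_dir"/*.jar; do
    # Skip the unexpanded pattern when the directory has no jars.
    [ -f "$jar" ] || continue
    # Prepend a comma only when the list is already non-empty.
    csv="${csv:+$csv,}$jar"
  done
  printf '%s\n' "$csv"
}

# Usage (hypothetical directory holding the log4j2 jars):
# ${SPARK_HOME}/bin/spark-submit --jars "$(jars_csv ./log4j2-jars)" ...
```

The function only assembles the argument; whether the executors actually pick up log4j2 still depends on the classpath behaviour described above.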
Try using the --driver-java-options
${SPARK_HOME}/bin/spark-submit \
--class \
--verbose \
--master 'local[*]' \
--files "log4j2.xml" \
--driver-java-options "-Dlog4j.configurationFile=log4j2.xml" \
--jars log4j-api-2.8.jar,log4j-core-2.8.jar,log4j-1.2-api-2.8.jar \
If log4j2 is used in one of your own dependencies, it's quite easy to bypass all configuration files and configure one or two high-level loggers programmatically if, and only if, no configuration file is found.
The code below does the trick. Just rename the logger to your top-level logger.
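import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.core.LoggerContext;
import org.apache.logging.log4j.core.appender.ConsoleAppender;
import org.apache.logging.log4j.core.config.DefaultConfiguration;
import org.apache.logging.log4j.core.config.builder.api.*;
import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;
private static boolean configured = false;
private static void buildLog() {
    try {
        final LoggerContext ctx = (LoggerContext) LogManager.getContext(false);
        System.out.println("Configuration found at "+ctx.getConfiguration().toString());
        // Configure programmatically only when log4j2 fell back to its default configuration.
        if (ctx.getConfiguration() instanceof DefaultConfiguration) {
            System.out.println("\n\n\nNo log4j2 config available. Configuring programmatically\n\n");
            ConfigurationBuilder<BuiltConfiguration> builder = ConfigurationBuilderFactory.newConfigurationBuilder();
            AppenderComponentBuilder appenderBuilder = builder.newAppender("Stdout", "CONSOLE")
                    .addAttribute("target", ConsoleAppender.Target.SYSTEM_OUT);
            appenderBuilder.add(builder.newLayout("PatternLayout").addAttribute("pattern", "%d [%t] %msg%n%throwable"));
            builder.add(appenderBuilder);
            LayoutComponentBuilder layoutBuilder = builder.newLayout("PatternLayout").addAttribute("pattern", "%d [%t] %-5level: %msg%n");
            appenderBuilder = builder.newAppender("file", "File").addAttribute("fileName", "./logs/ikoda.log")
                    .add(layoutBuilder);
            builder.add(appenderBuilder);
            builder.add(builder.newLogger("ikoda", Level.DEBUG)
                    .add(builder.newAppenderRef("file")).add(builder.newAppenderRef("Stdout"))
                    .addAttribute("additivity", false));
            ctx.start(builder.build());
        } else {
            System.out.println("Configuration file found.");
        }
        configured = true;
    } catch (Exception e) {
        System.out.println("\n\n\n\nFAILED TO CONFIGURE LOG4J2"+e.getMessage());
    }
}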

"No Filesystem for Scheme: gs" when running spark job locally

I am running a Spark job (version 1.2.0), and the input is a folder inside a Google Cloud Storage bucket (i.e. gs://mybucket/folder).
When running the job locally on my Mac machine, I am getting the following error:
5932 [main] ERROR com.doit.customer.dataconverter.Phase1 - Job for date: 2014_09_23 failed with error: No FileSystem for scheme: gs
I know that two things need to be done for gs paths to be supported: install the GCS connector, and add the following setup to core-site.xml of the Hadoop installation:
<description>The FileSystem for gs: (GCS) uris.</description>
The AbstractFileSystem for gs: (GCS) uris. Only necessary for use with Hadoop 2.
I think my problem comes from the fact that I am not sure where exactly each piece needs to be configured in this local mode. In the IntelliJ project, I am using Maven, so I imported the Spark library as follows:
<dependency> <!-- Spark dependency -->
<exclusion> <!-- declare the exclusion here -->
, and Hadoop 1.2.1 as follows:
The thing is, I am not sure where the Hadoop location is configured for Spark, or where the Hadoop conf is configured. Therefore, I may be adding the settings to the wrong Hadoop installation. In addition, is there something that needs to be restarted after modifying the files? As far as I can see, there is no Hadoop service running on my machine.
In Scala, add the following config when setting your hadoopConfiguration:
val conf = sc.hadoopConfiguration
conf.set("", "")
conf.set("", "")
There are a couple of ways to help Spark pick up the relevant Hadoop configurations, both involving modifying ${SPARK_INSTALL_DIR}/conf:
Copy or symlink your ${HADOOP_HOME}/conf/core-site.xml into ${SPARK_INSTALL_DIR}/conf/core-site.xml. For example, when bdutil installs onto a VM, it runs:
ln -s ${HADOOP_CONF_DIR}/core-site.xml ${SPARK_INSTALL_DIR}/conf/core-site.xml
Older Spark docs explain that this makes the xml files included in Spark's classpath automatically:
Add an entry to ${SPARK_INSTALL_DIR}/conf/ with:
export HADOOP_CONF_DIR=/full/path/to/your/hadoop/conf/dir
Newer Spark docs seem to indicate this as the preferred method going forward:
I can't say what's wrong, but here's what I would try.
Try setting <property><name></name><value>my-little-project</value></property>
Print sc.hadoopConfiguration.get( to make sure your core-site.xml is getting loaded. Print it in the driver and also in the executor: println(x); rdd.foreachPartition { _ => println(x) }
Make sure the GCS jar is sent to the executors (sparkConf.setJars(...)). I don't think this would matter in local mode (it's all one JVM, right?) but you never know.
Nothing but your program needs to be restarted. There is no Hadoop process. In local and standalone modes Spark only uses Hadoop as a library, and only for IO I think.
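Before debugging further, it can help to check that a core-site.xml is actually reachable through one of the two routes described earlier (a copy or symlink in Spark's conf directory, or a directory named by HADOOP_CONF_DIR). The sketch below only checks that the file exists; find_core_site and the paths in the usage line are hypothetical, and it says nothing about whether the file contains the right gs entries.

```shell
# Print the core-site.xml that one of the two approaches would expose,
# or return non-zero if neither location has one.
find_core_site() {
  spark_conf_dir="$1"    # e.g. ${SPARK_INSTALL_DIR}/conf
  hadoop_conf_dir="$2"   # e.g. ${HADOOP_CONF_DIR}; may be empty
  if [ -e "$spark_conf_dir/core-site.xml" ]; then
    printf '%s\n' "$spark_conf_dir/core-site.xml"
  elif [ -n "$hadoop_conf_dir" ] && [ -e "$hadoop_conf_dir/core-site.xml" ]; then
    printf '%s\n' "$hadoop_conf_dir/core-site.xml"
  else
    return 1
  fi
}

# Usage (hypothetical install locations):
# find_core_site "${SPARK_INSTALL_DIR}/conf" "${HADOOP_CONF_DIR}" || echo "no core-site.xml found"
```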
You can apply these settings directly on the spark reader/writer as follows:
.option("", "")
.option("", "")
.option("", "true")
.option("", "<path-to-json-keyfile.json>")
.option("header", true)
.show(10, false)
Then add the relevant jar dependency to your build.sbt (or whichever build tool you use), checking for the latest version:
"" % "gcs-connector" % "hadoop3-2.2.6" classifier "shaded"
See GCS Connector and Google Cloud Storage connector for non-dataproc clusters

How to deploy a Maven 3 artifact to a remote server using scp

I want to have my own Maven repository for artifacts I create myself, but I have a problem deploying a Maven 3 artifact to a custom server. To explain this better, here is some information:
I'm using Maven 3
I'm using Eclipse Kepler
I'm using Jenkins
The remote server is running Ubuntu Server 11.04
Jenkins is running on the Ubuntu server
My local machine is running Windows XP
My first attempt was from my own machine. I ran Maven in Eclipse to deploy, and everything worked fine. I added the following to my project's pom:
<id>my server id</id>
<name>my repository name</name>
<url>scpexe://my server//path/to/my/repository</url>
And in my settings.xml I added:
<id>my server id</id>
<username>server username</username>
<password>server password</password>
So it works on my local machine, but I need to get it working with Jenkins. I modified the Jenkins settings.xml; because Jenkins runs on Linux, it doesn't need sshExecutable. The Jenkins settings.xml looks like this:
<id>my server id</id>
<username>server username</username>
<password>server password</password>
Then I modified the pom.xml to use scp instead of scpexe:
<id>my server id</id>
<name>my repository name</name>
<url>scp://my server//path/to/my/repository</url>
But according to this page, Maven 3 does not support scp. I ran it anyway and got the following error message in the Jenkins log:
mavenExecutionResult exceptions not empty
message : Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.7:deploy (default-deploy) on project myproject: Failed to deploy artifacts/metadata: No connector available to access repository my_repository (scp://my server//path/to/my/repository) of type default using the available factories WagonRepositoryConnectorFactory
cause : Failed to deploy artifacts/metadata: No connector available to access repository my_repository (scp://my server//path/to/my/repository) of type default using the available factories WagonRepositoryConnectorFactory
Stack trace :
If I use scpexe instead of scp, I get another error message:
mavenExecutionResult exceptions not empty
message : Failed to execute goal org.apache.maven.plugins:maven-deploy-plugin:2.7:deploy (default-deploy) on project pruebanueva: Failed to deploy artifacts: Could not transfer artifact {$groupId}:{$artifactId}:{$package}:{$version} from/to my_repository (scpexe://my server//path/to/my/repository): Error executing command for transfer
cause : Failed to deploy artifacts: Could not transfer artifact {$groupId}:{$artifactId}:{$package}:{$version} from/to my_repository (scpexe://my server//path/to/my/repository): Error executing command for transfer
Stack trace :
The only way I could deploy was to do it in two steps:
Configuring Jenkins to make just the install goal
Running the following command from command line
mvn deploy:deploy-file -DgroupId=$groupId -DartifactId=$artifactId \
-Dversion=$version -Dpackaging=jar -Dfile=path/to/file.jar -Durl=scp://my server//path/to/my/repository -DrepositoryId=my repository id
I tried many things, including putting that command into the Jenkins goal, but every time I use scp in Jenkins the build fails.
Any idea on how to solve this issue would be appreciated.
I am interested to see whether there are any real Maven solutions to this. I have always fixed this using the Maven Antrun plugin as follows:
<echo>deploying to server: ${deployment.server}</echo>
<taskdef classname="" name="scp" />
<scp file="${}/${project.artifactId}.war" password="${deployment.password}" todir="${deployment.userName}@${deployment.server}:" trust="true" verbose="true" />
<!-- <sshexec command="echo unity | sudo -S cp ${}.jar $( if [ -e /station ]; then echo /station/lib; else echo /opt/pkg-station*/webapps/station*/WEB-INF/lib; fi )" host="${targetStation}" password="unity" trust="true" username="wps"></sshexec> -->
A few notes on this: I activate this profile with a combination of running to the deploy phase, and providing a deployment.server setting. For my convenience then, I add the corresponding settings to my settings.xml so that I don't have to provide these all on the command-line every time:
I skip the actual deploy goal because it will be executed when I run to the deploy phase, which I don't want.
Verhagen's answer is correct, but a purer Maven solution is this:

Deploying an Eclipse Maven project to Tomcat on a remote Linux server

I'm looking for a way to deploy a Maven project developed in Eclipse to Tomcat on a remote Linux server. I know you can export it as a .war file and drop it in the CATALINA_HOME/webapps folder of the remote server. But for that you first have to export it to a .war file and then copy the .war file to the remote server through SFTP or SCP. I'm looking for a way to do it in a few clicks using Eclipse and/or by configuring some Maven settings (in pom.xml or settings.xml). Does anyone know how to do this? Any help is really appreciated.
The tool you are looking for is called the Tomcat Maven Plugin.
What it basically does is use the API of the Tomcat manager application, which you have to make sure is deployed on the Tomcat instance you are using. By default, the Tomcat manager should be available at the following location:
If it is not, please install it using the following command:
sudo apt-get install tomcat6-admin
You can configure the location of your Tomcat instance as follows:
and then run the mvn tomcat:deploy goal (either from the command line or from Eclipse using the m2eclipse plugin).
Please refer to configuration and deployment pages of the plugin for more verbose information.
The most flexible solution with adapters for many different containers like Tomcat, Jetty, Glassfish, etc. is probably Maven Cargo plugin. You can find an extensive list of examples on their homepage, so no need to paste that here again.
To deploy an application remotely, you'll need to configure the Tomcat deployer app on the Tomcat instance. Be warned: the configuration of admin users has undergone some subtle changes between Tomcat 6 and 7.
Once this is working, the Maven Cargo plugin can deploy war files as follows:
Additional notes
The Cargo plugin supports several different containers; the problem is that the documentation is difficult to interpret.
I haven't used the Maven plugin. It's very new.
