Spark Unit Tests error on Window expression - apache-spark

I'm currently running Spark unit tests under IntelliJ, using DataFrameSuiteBase
and SharedSparkContext.
I'm on Windows, and everything works as long as the Spark operations don't
use the org.apache.spark.sql.expressions.Window object.
For instance:
val accu = anosGroupees.select($"col1", $"avg(col2)",
  sum($"avg(col2)").over(Window.orderBy("col3").rowsBetween(Long.MinValue, -1)).as("mycolumn"))
The error is:
Task not serializable
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2055)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:707)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:706)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:706)
at org.apache.spark.sql.execution.Window.doExecute(Window.scala:245)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.Project.doExecute(basicOperators.scala:46)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.Coalesce.doExecute(basicOperators.scala:250)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.Project.doExecute(basicOperators.scala:46)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
at org.apache.spark.sql.execution.joins.BroadcastHashJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashJoin.scala:82)
at org.apache.spark.sql.execution.joins.BroadcastHashJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashJoin.scala:79)
at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:100)
at org.apache.spark.sql.execution.joins.BroadcastHashJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashJoin.scala:79)
at org.apache.spark.sql.execution.joins.BroadcastHashJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashJoin.scala:79)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: org.apache.hive.com.esotericsoftware.kryo.Kryo cannot be cast to com.esotericsoftware.kryo.Kryo
at org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.serializePlan(HiveShim.scala:155)
at org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.writeExternal(HiveShim.scala:168)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1459)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1430)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1378)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
... 39 more
What do you think about that? Is there a classpath problem?
If I run this code on a YARN cluster, there's no problem. It only occurs
on Windows under a Maven project. The dependencies in the pom are:
<dependencies>
<dependency>
<groupId>ai.h2o</groupId>
<artifactId>sparkling-water-core_2.10</artifactId>
<version>1.5.10</version>
</dependency>
<dependency>
<groupId>ai.h2o</groupId>
<artifactId>h2o-scala_2.10</artifactId>
<version>3.8.0.3</version>
</dependency>
<dependency>
<groupId>ai.h2o</groupId>
<artifactId>h2o-app</artifactId>
<version>3.8.0.3</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>15.0</version>
</dependency>
<dependency>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.2</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.10</artifactId>
<version>2.2.6</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>1.7.14</version>
</dependency>
<dependency>
<groupId>com.holdenkarau</groupId>
<artifactId>spark-testing-base_2.10</artifactId>
<version>1.5.2_0.3.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-hadoop</artifactId>
<version>2.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-csv</artifactId>
<version>1.2</version>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.4</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark_2.10</artifactId>
<version>2.1.2</version>
</dependency>
</dependencies>
Regards,
Brice

The root cause is the ClassCastException: org.apache.hive.com.esotericsoftware.kryo.Kryo cannot be cast to
com.esotericsoftware.kryo.Kryo.
I have encountered the same problem in Java.
The hive-exec jar is packaged with a shaded Kryo implementation, while the HiveShim class in the spark-hive jar links against a different (unshaded) Kryo class. A workaround in the unit test is to reassign runtimeSerializationKryo:
import com.esotericsoftware.kryo.Kryo; // NOT the org.apache.hive shaded version
import org.apache.hadoop.hive.ql.exec.Utilities;
import org.junit.BeforeClass;

// For JUnit 4 you can use @BeforeClass
@BeforeClass
public static void otherKryoImpl() {
    // HiveShim (spark-hive_2.10) expects com.esotericsoftware.kryo.Kryo,
    // but org.apache.hadoop.hive.ql.exec.Utilities holds Hive's shaded copy.
    Utilities.runtimeSerializationKryo = new ThreadLocal() { // Not typed, to avoid a compiler warning.
        @Override
        protected synchronized Kryo initialValue() {
            Kryo kryo = new Kryo();
            // copy stuff from the original implementation...
            return kryo;
        }
    };
}
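Note that the reassignment has to happen before the first Spark action that triggers plan serialization, which is why a once-per-class hook like @BeforeClass is a natural place for it.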

I have encountered the same problem, and I fixed it :-)
It seems to be a classpath problem;
just put the Spark libs in first place on your extra driver (executor) classpath.
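As a quick sanity check (a sketch of mine, assuming hive-exec is on the test classpath), you can print which hive-exec copy wins by locating its Utilities class. On a cluster, the spark.driver.extraClassPath / spark.executor.extraClassPath settings control what gets prepended; in an IDE or Maven test run, the dependency order in the POM decides.
import org.apache.hadoop.hive.ql.exec.Utilities;

// Prints the jar that Hive's Utilities class (and therefore its Kryo) is
// loaded from, to verify which hive-exec copy is first on the classpath.
public class ClasspathProbe {
    public static void main(String[] args) {
        System.out.println(Utilities.class.getProtectionDomain()
                .getCodeSource().getLocation());
    }
}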

Related

Writing DataFrame to local path encountered error

Code works perfectly on my laptop, but encountered an error while running on my desktop:
resultRDD.toDF().show()
resultRDD.toDF().write.format("delta").save(outputPath)
The results can be shown, but the write operation fails with the error below:
21/05/25 17:26:48 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:869)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:852)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.parquet.hadoop.util.HadoopOutputFile.create(HadoopOutputFile.java:74)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:248)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:390)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:349)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:150)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:126)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:111)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:264)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:444)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:447)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
21/05/25 17:26:48 ERROR Executor: Exception in task 1.0 in stage 1.0 (TID 2)
ExitCodeException exitCode=-1073741515:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:869)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:852)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.parquet.hadoop.util.HadoopOutputFile.create(HadoopOutputFile.java:74)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:248)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:390)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:349)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:150)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:126)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:111)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:264)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:444)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:447)
The Maven pom.xml is:
<artifactId>spark-core</artifactId>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>io.delta</groupId>
<artifactId>delta-core_2.12</artifactId>
<version>0.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-yarn_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.10.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.alibaba/druid -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid</artifactId>
<version>1.1.10</version>
</dependency>
</dependencies>
I tried putting winutils.exe and hadoop.dll into %HADOOP_HOME%/bin and windows/system32, which did not work,
and I have SPARK_HOME and HADOOP_HOME set up properly.
I figured it out with the answer from https://github.com/jleetutorial/scala-spark-tutorial/issues/2: you have to install both the x86 and x64 vcredist 2010 packages to fix it.
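For what it's worth, exit code -1073741515 is 0xC0000135 (STATUS_DLL_NOT_FOUND), which fits: winutils.exe and hadoop.dll link against the Visual C++ 2010 runtime (msvcr100.dll), so installing the redistributables supplies the missing DLL.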

Setting up Spock test with Allure reports

I have a test that runs on the Spock framework. I am trying to set up Allure reports with it. I don't see an example for Spock integration here: https://github.com/allure-examples. So I took the JUnit 5 Maven-based example, https://github.com/allure-examples/allure-junit5-maven, and am trying to set it up. I modified the dependency from
<dependency>
<groupId>io.qameta.allure</groupId>
<artifactId>allure-junit5</artifactId>
<version>${allure.version}</version>
</dependency>
to
<dependency>
<groupId>io.qameta.allure</groupId>
<artifactId>allure-spock</artifactId>
<version>2.13.10</version>
</dependency>
since I am using Spock here to run the tests.
Below is the pom I am using:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>sample</groupId>
<version>0.0.1-SNAPSHOT</version>
<artifactId>sample</artifactId>
<name>sample-test</name>
<packaging>jar</packaging>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<java.version>1.8</java.version>
<spock.version>2.0-M5-groovy-3.0</spock.version>
<dbunit.version>2.5.1</dbunit.version>
<hamcrest.version>1.3</hamcrest.version>
<geb.version>0.13.1</geb.version>
<selenium.version>2.51.0</selenium.version>
<groovy.version>3.0.8</groovy.version>
</properties>
<repositories>
<!--other repositories if any-->
<repository>
<id>project.local</id>
<name>project</name>
<url>file:${project.basedir}/../repo</url>
</repository>
</repositories>
<dependencies>
<!--mandatory for the groovy CLI scripts -->
<dependency>
<groupId>org.mod4j.org.apache.commons</groupId>
<artifactId>cli</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy-all</artifactId>
<version>${groovy.version}</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>com.github.jankroken</groupId>
<artifactId>commandline</artifactId>
<version>1.7.0</version>
</dependency>
<!-- Mandatory dependencies for using Spock -->
<dependency>
<groupId>org.spockframework</groupId>
<artifactId>spock-core</artifactId>
<version>${spock.version}</version>
<scope>test</scope>
</dependency>
<!-- Mandatory dependencies for tests with DB interaction -->
<dependency>
<groupId>org.dbunit</groupId>
<artifactId>dbunit</artifactId>
<version>${dbunit.version}</version>
<!-- not scoped for test since Import/export use this library -->
</dependency>
<dependency>
<groupId>com.oracle</groupId>
<artifactId>ojdbc7</artifactId>
<version>12.1.0.1</version>
</dependency>
<dependency>
<groupId>com.oracle</groupId>
<artifactId>xdb6</artifactId>
<version>12.1.0.1</version>
</dependency>
<dependency>
<groupId>commons-beanutils</groupId>
<artifactId>commons-beanutils</artifactId>
<version>1.4</version>
</dependency>
<dependency>
<groupId>org.jdom</groupId>
<artifactId>jdom</artifactId>
<version>1.1</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.16</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.2</version>
</dependency>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
<version>3.2.2</version>
</dependency>
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.6</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.3.1</version>
</dependency>
<!-- JSON serialization/de-serialization library needed for JSONDataSet-->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.7.3</version>
</dependency>
<!-- h2databse library used to query CSV files with SQL -->
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.4.191</version>
</dependency>
<!-- Geb testing support -->
<dependency>
<groupId>org.gebish</groupId>
<artifactId>geb-spock</artifactId>
<version>${geb.version}</version>
<scope>test</scope>
</dependency>
<!-- httpcomponents upgrade to fix error in HTMLUnit-->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.5</version>
</dependency>
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-support</artifactId>
<version>${selenium.version}</version>
<scope>test</scope>
</dependency>
<!-- Selenium Web Driver Manager-->
<dependency>
<groupId>io.github.bonigarcia</groupId>
<artifactId>webdrivermanager</artifactId>
<version>1.4.5</version>
</dependency>
<!-- Hadoop dependencies -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.7.1</version>
</dependency>
<!-- Oozie dependencies -->
<dependency>
<groupId>org.apache.oozie</groupId>
<artifactId>oozie-client</artifactId>
<version>4.2.0</version>
</dependency>
<!-- Hive dependencies -->
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>2.1.0</version>
<exclusions>
<exclusion>
<artifactId>jdk.tools</artifactId>
<groupId>jdk.tools</groupId>
</exclusion>
</exclusions>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.maven.plugins/maven-surefire-report-plugin -->
<dependency>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-report-plugin</artifactId>
<version>3.0.0-M5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.maven.plugins/maven-site-plugin -->
<dependency>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
<version>3.9.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/io.qameta.allure/allure-spock -->
<dependency>
<groupId>io.qameta.allure</groupId>
<artifactId>allure-spock</artifactId>
<version>2.13.10</version>
</dependency>
<dependency>
<groupId>org.junit.platform</groupId>
<artifactId>junit-platform-surefire-provider</artifactId>
<version>1.3.2</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>1.7.30</version>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>5.8.0-M1</version>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<version>5.8.0-M1</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>${project.basedir}/src/main/groovy</sourceDirectory>
<testSourceDirectory>${project.basedir}/src/test/groovy</testSourceDirectory>
<resources>
<resource>
<directory>${project.basedir}/src/main/resources</directory>
</resource>
</resources>
<plugins>
<!-- Mandatory plugins for using Spock -->
<plugin>
<!-- The gmavenplus plugin is used to compile Groovy code. To learn more about this plugin,
visit https://github.com/groovy/GMavenPlus/wiki -->
<groupId>org.codehaus.gmavenplus</groupId>
<artifactId>gmavenplus-plugin</artifactId>
<version>1.12.1</version>
<executions>
<execution>
<goals>
<goal>addSources</goal>
<goal>addTestSources</goal>
<goal>compile</goal>
<goal>compileTests</goal>
</goals>
</execution>
</executions>
</plugin>
<!-- Optional plugins for using Spock -->
<!-- Only required if names of spec classes don't match default Surefire patterns (`*Test` etc.) -->
<plugin>
<artifactId>maven-surefire-plugin</artifactId>
<version>3.0.0-M5</version>
<configuration>
<useFile>false</useFile>
<argLine>
-Dfile.encoding=UTF-8
-javaagent:"${settings.localRepository}/org/aspectj/aspectjweaver/1.9.6/aspectjweaver-1.9.6.jar"
</argLine>
<includes>
<include>**/*Spec.groovy</include>
<include>**/*Spec.java</include>
<include>**/*Test.groovy</include>
<include>**/*Test.java</include>
</includes>
<systemPropertyVariables>
<geb.build.reportsDir>target/test-reports/geb</geb.build.reportsDir>
<allure.results.directory>${project.build.directory}/allure-results</allure.results.directory>
<junit.jupiter.extensions.autodetection.enabled>true</junit.jupiter.extensions.autodetection.enabled>
</systemPropertyVariables>
</configuration>
<dependencies>
<dependency>
<groupId>org.junit.platform</groupId>
<artifactId>junit-platform-surefire-provider</artifactId>
<version>1.3.2</version>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjweaver</artifactId>
<version>1.9.6</version>
</dependency>
</dependencies>
</plugin>
<plugin>
<groupId>io.qameta.allure</groupId>
<artifactId>allure-maven</artifactId>
<version>2.10.0</version>
<configuration>
<reportVersion>2.13.10</reportVersion>
<resultsDirectory>${project.build.directory}/allure-results</resultsDirectory>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
<version>3.9.1</version>
</plugin>
</plugins>
</build>
<reporting>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-report-plugin</artifactId>
<version>3.0.0-M5</version>
</plugin>
</plugins>
</reporting>
</project>
I am running a specific test class using this command:
mvn -f pom.xml test -Dtest=CalcSpec -Dmaven.test.failure.ignore surefire-report:report
But I am getting this error while running it:
[WARNING] Error injecting: org.apache.maven.plugin.surefire.SurefirePlugin
java.lang.NoClassDefFoundError: org/apache/maven/surefire/api/testset/TestSetFailedException
at java.lang.Class.getDeclaredConstructors0 (Native Method)
at java.lang.Class.privateGetDeclaredConstructors (Class.java:2671)
at java.lang.Class.getDeclaredConstructors (Class.java:2020)
at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.ClassNotFoundException: org.apache.maven.surefire.api.testset.TestSetFailedException
at org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy.loadClass (SelfFirstStrategy.java:50)
at org.codehaus.plexus.classworlds.realm.ClassRealm.unsynchronizedLoadClass (ClassRealm.java:271)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:247)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:239)
at java.lang.Class.getDeclaredConstructors0 (Native Method)
at java.lang.Class.privateGetDeclaredConstructors (Class.java:2671)
at java.lang.Class.getDeclaredConstructors (Class.java:2020)
at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 22.275 s
[INFO] Finished at: 2021-05-14T07:51:58-04:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on project pic-test: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plu
gin:3.0.0-M5:test failed: A required class was missing while executing org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test: org/apache/maven/surefire/api/testset/TestSetFailedException
[ERROR] -----------------------------------------------------
[ERROR] realm = plugin>org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5
[ERROR] strategy = org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] = file:/C:/Users/tji/.m2/repository/org/apache/maven/plugins/maven-surefire-plugin/3.0.0-M5/maven-surefire-plugin-3.0.0-M5.jar
[ERROR] urls[1] = file:/C:/Users/tji/.m2/repository/org/junit/platform/junit-platform-surefire-provider/1.3.2/junit-platform-surefire-provider-1.3.2.jar
[ERROR] urls[2] = file:/C:/Users/tji/.m2/repository/org/apiguardian/apiguardian-api/1.0.0/apiguardian-api-1.0.0.jar
[ERROR] urls[3] = file:/C:/Users/tji/.m2/repository/org/junit/platform/junit-platform-launcher/1.3.2/junit-platform-launcher-1.3.2.jar
[ERROR] urls[4] = file:/C:/Users/tji/.m2/repository/org/junit/platform/junit-platform-engine/1.3.2/junit-platform-engine-1.3.2.jar
[ERROR] urls[5] = file:/C:/Users/tji/.m2/repository/org/junit/platform/junit-platform-commons/1.3.2/junit-platform-commons-1.3.2.jar
[ERROR] urls[6] = file:/C:/Users/tji/.m2/repository/org/opentest4j/opentest4j/1.1.1/opentest4j-1.1.1.jar
[ERROR] urls[7] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-api/2.22.0/surefire-api-2.22.0.jar
[ERROR] urls[8] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-logger-api/2.22.0/surefire-logger-api-2.22.0.jar
[ERROR] urls[9] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/common-java5/2.22.0/common-java5-2.22.0.jar
[ERROR] urls[10] = file:/C:/Users/tji/.m2/repository/org/aspectj/aspectjweaver/1.9.6/aspectjweaver-1.9.6.jar
[ERROR] urls[11] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/maven-surefire-common/3.0.0-M5/maven-surefire-common-3.0.0-M5.jar
[ERROR] urls[12] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-extensions-api/3.0.0-M5/surefire-extensions-api-3.0.0-M5.jar
[ERROR] urls[13] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-booter/3.0.0-M5/surefire-booter-3.0.0-M5.jar
[ERROR] urls[14] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-extensions-spi/3.0.0-M5/surefire-extensions-spi-3.0.0-M5.jar
[ERROR] urls[15] = file:/C:/Users/tji/.m2/repository/org/apache/maven/shared/maven-artifact-transfer/0.11.0/maven-artifact-transfer-0.11.0.jar
[ERROR] urls[16] = file:/C:/Users/tji/.m2/repository/org/apache/maven/shared/maven-common-artifact-filters/3.1.0/maven-common-artifact-filters-3.1.0.jar
[ERROR] urls[17] = file:/C:/Users/tji/.m2/repository/commons-codec/commons-codec/1.11/commons-codec-1.11.jar
[ERROR] urls[18] = file:/C:/Users/tji/.m2/repository/org/codehaus/plexus/plexus-java/1.0.5/plexus-java-1.0.5.jar
[ERROR] urls[19] = file:/C:/Users/tji/.m2/repository/org/ow2/asm/asm/7.2/asm-7.2.jar
[ERROR] urls[20] = file:/C:/Users/tji/.m2/repository/com/thoughtworks/qdox/qdox/2.0-M9/qdox-2.0-M9.jar
[ERROR] urls[21] = file:/C:/Users/tji/.m2/repository/org/apache/maven/surefire/surefire-shared-utils/3.0.0-M4/surefire-shared-utils-3.0.0-M4.jar
[ERROR] urls[22] = file:/C:/Users/tji/.m2/repository/org/codehaus/plexus/plexus-utils/1.1/plexus-utils-1.1.jar
[ERROR] Number of foreign imports: 1
[ERROR] import: Entry[import from realm ClassRealm[maven.api, parent: null]]
[ERROR]
[ERROR] -----------------------------------------------------
[ERROR] : org.apache.maven.surefire.api.testset.TestSetFailedException
What could be the issue here? And is there any example out there for integrating Allure reports with the Spock testing framework?
You have to add surefire-api as an explicit dependency of the surefire plugin:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>3.0.0-M5</version>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.junit.platform/junit-platform-surefire-provider -->
<dependency>
<groupId>org.junit.platform</groupId>
<artifactId>junit-platform-surefire-provider</artifactId>
<version>1.3.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.maven.surefire/surefire-api -->
<dependency>
<groupId>org.apache.maven.surefire</groupId>
<artifactId>surefire-api</artifactId>
<version>3.0.0-M5</version>
</dependency>
...
</dependencies>
</plugin>
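For background: maven-surefire-plugin 3.0.0-M5 looks for the Surefire API under the new org.apache.maven.surefire.api.* packages, but the deprecated junit-platform-surefire-provider 1.3.2 pulls the old surefire-api 2.22.0 into the plugin realm (urls[7] in the dump above), which does not contain that class; declaring surefire-api 3.0.0-M5 explicitly puts the right version on the plugin classpath.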

Spark Cassandra Java integration Issues

I am new to both Spark and Cassandra.
I am trying to achieve aggregate functionality using Spark + Java on Cassandra data.
I am not able to fetch the Cassandra data in my code. I read multiple discussions and found out that there are some compatibility issues between Spark and the Spark-Cassandra connector. I tried a lot to fix my issue but was not able to.
Find below the pom.xml (kindly do not mind the extra dependencies; I need to figure out which library is causing the issue):
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>IBeatCassPOC</groupId>
<artifactId>ibeatCassPOC</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<!--CASSANDRA START-->
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-mapping</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-extras</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>com.sparkjava</groupId>
<artifactId>spark-core</artifactId>
<version>2.5.4</version>
</dependency>
<!--https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10-->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>2.0.0-M3</version>
</dependency>
<!--CASSANDRA END-->
<!-- Kafka -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.10</artifactId>
<version>0.8.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.8.2.1</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.2</version>
</dependency>
<!-- Spark -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.4.0</version>
</dependency>
<!-- Logging -->
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<!-- Spark-Kafka -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.4.0</version>
</dependency>
<!-- Jackson -->
<dependency>
<groupId>org.codehaus.jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
<version>1.9.13</version>
</dependency>
<!-- Google Collection Library -->
<dependency>
<groupId>com.google.collections</groupId>
<artifactId>google-collections</artifactId>
<version>1.0-rc2</version>
</dependency>
<!--UA Detector dependency for AgentType in PageTrendLog-->
<dependency>
<groupId>net.sf.uadetector</groupId>
<artifactId>uadetector-core</artifactId>
<version>0.9.12</version>
</dependency>
<dependency>
<groupId>net.sf.uadetector</groupId>
<artifactId>uadetector-resources</artifactId>
<version>2013.12</version>
</dependency>
<dependency>
<groupId>com.esotericsoftware</groupId>
<artifactId>kryo</artifactId>
<version>3.0.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.twitter4j</groupId>
<artifactId>twitter4j-stream</artifactId>
<version>4.0.4</version>
</dependency>
<!-- MongoDb Java Connector -->
<!-- <dependency> <groupId>org.mongodb</groupId> <artifactId>mongo-java-driver</artifactId>
<version>2.13.0</version> </dependency> -->
</dependencies>
The Java source code being used to fetch the data:
import com.datastax.spark.connector.japi.CassandraJavaUtil;
import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraJavaRDD;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import java.util.ArrayList;
public class ReadCassData {
public static void main(String[] args) {
SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("Spark-Cassandra Integration");
sparkConf.setMaster("local[4]");
sparkConf.set("spark.cassandra.connection.host", "stagingServer22");
sparkConf.set("spark.cassandra.connection.port", "9042");
sparkConf.set("spark.cassandra.connection.timeout_ms", "5000");
sparkConf.set("spark.cassandra.read.timeout_ms", "200000");
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
String keySpaceName = "testKeyspace";
String tableName = "testTable";
CassandraJavaRDD<CassandraRow> cassandraRDD = CassandraJavaUtil.javaFunctions(javaSparkContext).cassandraTable(keySpaceName, tableName);
System.out.println("Cassandra Count" + cassandraRDD.cassandraCount());
final ArrayList<CassandraRow> data = new ArrayList<CassandraRow>();
cassandraRDD.reduce(new Function2<CassandraRow, CassandraRow, CassandraRow>() {
public CassandraRow call(CassandraRow v1, CassandraRow v2) throws Exception {
System.out.println("hello");
System.out.println(v1 + " ____ " + v2);
data.add(v1);
data.add(v2);
return null;
}
});
System.out.println( "data Size -" + data.size());
}
}
The exception being encountered is:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 1 times, most recent failure: Lost task 2.0 in stage 0.0 (TID 2, localhost): java.lang.NoSuchMethodError: org.apache.spark.TaskContext.getMetricsSources(Ljava/lang/String;)Lscala/collection/Seq;
at org.apache.spark.metrics.MetricsUpdater$.getSource(MetricsUpdater.scala:20)
at org.apache.spark.metrics.InputMetricsUpdater$.apply(InputMetricsUpdater.scala:56)
at com.datastax.spark.connector.rdd.CassandraTableScanRDD.compute(CassandraTableScanRDD.scala:329)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
at org.apache.spark.scheduler.Task.run(Task.scala:70)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1450)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
I have a Cassandra cluster deployed at a remote location, and the Cassandra version being used is 3.9.
Please advise on compatible dependencies. I cannot change my Cassandra version (currently 3.9). Please suggest which Spark / Spark-Cassandra connector versions to use to successfully execute map-reduce jobs on the DB.
I have tried connecting with Spark, using the Spark Cassandra connector in Scala:
val spark = "com.datastax.spark" %% "spark-cassandra-connector" % "1.6.0"
val sparkCore = "org.apache.spark" %% "spark-sql" % "1.6.1"
And below is my working code:
import com.datastax.driver.dse.graph.GraphResultSet
import com.spok.util.LoggerUtil
import com.datastax.spark.connector._
import org.apache.spark._
object DseSparkGraphFactory extends App {
val dseConn = {
LoggerUtil.info("Connecting with DSE Spark Cluster....")
val conf = new SparkConf(true)
.setMaster("local[*]")
.setAppName("test")
.set("spark.cassandra.connection.host", "Ip-Address")
val sc = new SparkContext(conf)
val rdd = sc.cassandraTable("spokg_test", "Url_p")
rdd.collect().map(println)
}
}
Please refer to the Cassandra Spark Connector compatibility table for the relevant connector version for the Spark version in your environment. It should be 1.5, 1.6 or 2.0.
Following POM worked for me:
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>jdk.tools</groupId>
<artifactId>jdk.tools</artifactId>
<version>1.6</version>
<scope>system</scope>
<systemPath>D:\Jars\tools-1.6.0.jar</systemPath>
</dependency>
</dependencies>
Check it. I successfully ingested streaming data from Kafka into Cassandra. Similarly, you can pull data into a JavaRDD, as in the sketch below.
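A minimal sketch of that (hypothetical host, keyspace and table names), pulling rows to the driver through the connector's Java API:
import com.datastax.spark.connector.japi.CassandraRow;
import com.datastax.spark.connector.japi.rdd.CassandraJavaRDD;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import java.util.List;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

public class PullIntoJavaRdd {
    public static void main(String[] args) {
        // Hypothetical host/keyspace/table names.
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")
                .setAppName("cassandra-pull")
                .set("spark.cassandra.connection.host", "127.0.0.1");
        JavaSparkContext sc = new JavaSparkContext(conf);

        CassandraJavaRDD<CassandraRow> rdd =
                javaFunctions(sc).cassandraTable("testKeyspace", "testTable");

        // collect() brings the rows to the driver; avoid mutating driver-side
        // collections from inside reduce/map closures as in the snippet above.
        List<CassandraRow> rows = rdd.collect();
        System.out.println("row count = " + rows.size());

        sc.stop();
    }
}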

NoSuchElementException: key not found: 'int' with Spark Cassandra

I'm getting the following error while using Cassandra 3.0.5 and Scala 2.10:
Exception in thread "main" java.util.NoSuchElementException: key not found: 'int'
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at com.datastax.spark.connector.types.ColumnType$.fromDriverType(ColumnType.scala:81)
at com.datastax.spark.connector.cql.ColumnDef$.apply(Schema.scala:117)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchPartitionKey$1.apply(Schema.scala:199)
at com.datastax.spark.connector.cql.Schema$$anonfun$com$datastax$spark$connector$cql$Schema$$fetchPartitionKey$1.apply(Schema.scala:198)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
at scala.collection.immutable.HashSet$HashSet1.foreach(HashSet.scala:153)
at scala.collection.immutable.HashSet$HashTrieSet.foreach(HashSet.scala:306)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
at com.datastax.spark.connector.cql.Schema$.com$datastax$spark$connector$cql$Schema$$fetchKeyspaces$1(Schema.scala:246)
Here are my Spark dependencies:
<!-- Spark dependancies -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.4.1</version>
</dependency>
<!-- Connectors -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0-M3</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.5.0-M2</version>
</dependency>
And my Java code:
SparkConf conf = new SparkConf();
conf.setAppName("Java API demo");
conf.setMaster("local");
conf.set("spark.cassandra.connection.host", "localhost");
conf.set("spark.cassandra.connection.port", "9042");
conf.set("spark.cassandra.connection.timeout_ms", "40000");
conf.set("spark.cassandra.read.timeout_ms", "200000");
conf.set("spark.cassandra.auth.username", "username");
conf.set("spark.cassandra.auth.password", "password");
SimpleSpark app = new SimpleSpark(conf);
app.run();
I believe the versions I used were compatible; what is causing this error?
Please update your com.datastax.spark connector to 1.5.0-RC1 instead of 1.5.0-M3; it is a bug in 1.5.0-M3:
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0-RC1</version>
</dependency>

Exception in thread "main" java.lang.NoClassDefFoundError: gherkin/formatter/Formatter

I am learning how to write BDD test scripts in Java using Cucumber. However, I keep getting the above error and am not sure why. I have the Cukes Gherkin as a dependency.
POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>Cucumber</groupId>
<artifactId>Cucumber</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>gherkin</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-junit</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-jvm-deps</artifactId>
<version>1.0.5</version>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
</exclusion>
<exclusion>
<groupId>com.googlecode.java-diff-utils</groupId>
<artifactId>diffutils</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-core</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-java</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>info.cukes</groupId>
<artifactId>cucumber-picocontainer</artifactId>
<version>1.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.picocontainer</groupId>
<artifactId>picocontainer</artifactId>
<version>2.15</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>codehaus</id>
<url>http://repository.codehaus.org</url>
</repository>
</repositories>
<profiles>
<profile>
<id>junit-4.12</id>
<properties>
<junit.version>4.12</junit.version>
</properties>
</profile>
</profiles>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.3</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.5</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-clean-plugin</artifactId>
<version>2.6.1</version>
<configuration>
<filesets>
<fileset>
<directory>.</directory>
<includes>
<include>**/*.ser</include>
</includes>
</fileset>
</filesets>
</configuration>
</plugin>
</plugins>
</build>
</project>
Feature
Feature: Letter
Scenario: Check Letter
Given I have the letter "A"
When Icheck the letter "A"
Then I should see an output
Steps
package cucumber.steps;
import cucumber.api.CucumberOptions;
import cucumber.api.java.en.*;
import cucumber.api.junit.Cucumber;
import org.junit.Assert;
import org.junit.runner.RunWith;
/**
* Created by Dustin on 8/31/2015.
*/
@RunWith(Cucumber.class)
@CucumberOptions(
plugin = {"json-pretty", "html:target/cucumber"},
features = "src/main/java/cucumber/steps/LetterStepDefs"
)
public class LetterStepDefs {
private String letter;
private String message;
#Given("^I have the letter \"([^\"]*)\"$")
public void I_have_the_letter(String letter) throws Throwable {
// Express the Regexp above with the code you wish you had
this.letter = letter;
}
#When("^Icheck the letter \"([^\"]*)\"$")
public void Icheck_the_letter(String letter) throws Throwable {
// Express the Regexp above with the code you wish you had
try
{
Assert.assertEquals(this.letter, letter);
}
catch (Exception exc)
{
message = exc.getMessage();
}
}
#Then("^I should see an output$")
public void I_should_see_an_output() throws Throwable {
// Express the Regexp above with the code you wish you had
System.out.println(message);
}
}
Output
Testing started at 4:41 PM ...
Connected to the target VM, address: '127.0.0.1:58473', transport: 'socket'
JUnit version 4.12
Exception in thread "main" java.lang.NoClassDefFoundError: gherkin/formatter/Formatter
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.junit.internal.Classes.getClass(Classes.java:16)
at org.junit.runner.JUnitCommandLineParseResult.parseParameters(JUnitCommandLineParseResult.java:100)
at org.junit.runner.JUnitCommandLineParseResult.parseArgs(JUnitCommandLineParseResult.java:50)
at org.junit.runner.JUnitCommandLineParseResult.parse(JUnitCommandLineParseResult.java:44)
at org.junit.runner.JUnitCore.runMain(JUnitCore.java:72)
at org.junit.runner.JUnitCore.main(JUnitCore.java:36)
Caused by: java.lang.ClassNotFoundException: gherkin.formatter.Formatter
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 20 more
Disconnected from the target VM, address: '127.0.0.1:58473', transport: 'socket'
Process finished with exit code 1
Any help is much appreciated!
I was working with Cucumber and some Selenium scripts today and came across a similar issue whenever I was using the gherkin3 jar file.
Once I switched back to using gherkin 2.12.2, the issue went away.
You can download the jar from the following location:
http://search.maven.org/#search%7Cga%7C1%7Cgherkin
It is certainly worth trying this and checking whether you get the same issue.
I would also try running your feature file without any methods, to check that it returns the methods you need to create, similar to what is detailed in the document here:
http://www.toolsqa.com/cucumber/first-cucumber-selenium-java-test/
You don't need the glue option detailed in the example when you just want to run the feature file.
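If you manage the jar through Maven rather than downloading it manually, pinning the gherkin dependency in the POM above to 2.12.2 should have the same effect (coordinates assumed from the info.cukes group already in use):
<dependency>
<groupId>info.cukes</groupId>
<artifactId>gherkin</artifactId>
<version>2.12.2</version>
<scope>test</scope>
</dependency>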
