Flink job with CassandraSink fails with Error writing - cassandra

I have two simple Flink streaming jobs that read from Kafka, do some transformations, and put the result into a Cassandra sink. They read from different Kafka topics and save into different Cassandra tables.
When I run either of the two jobs alone, everything works fine: checkpoints are triggered and completed, and data is saved to Cassandra.
But whenever I run both jobs (or one of them twice), the second job fails at startup with this exception:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [localhost/127.0.0.1] Error writing)).
I could not find much info about this error; it may be caused by any one of the following:
Flink (v 1.10.0-scala_2.12),
Flink Cassandra Connector (flink-connector-cassandra_2.11:jar:1.10.2, also tried with flink-connector-cassandra_2.12:jar:1.10.0),
Datastax underlying driver (v 3.10.2),
Cassandra v4.0 (same with v3.0),
Netty transport (v 4.1.51.Final).
I also use packages that may conflict with the ones above:
mysql-connector-java (v 8.0.19),
cassandra-driver-extras (v 3.10.2)
Finally this is my code for the cluster builder:
ClusterBuilder builder = new ClusterBuilder() {
    @Override
    protected Cluster buildCluster(Cluster.Builder builder) {
        Cluster cluster = null;
        try {
            cluster = builder
                    .addContactPoint("localhost")
                    .withPort(9042)
                    .withClusterName("Test Cluster")
                    .withoutJMXReporting()
                    .withProtocolVersion(ProtocolVersion.V4)
                    .withoutMetrics()
                    .build();
            // register codecs from datastax extras.
            cluster.getConfiguration().getCodecRegistry()
                    .register(LocalTimeCodec.instance);
        } catch (ConfigurationException e) {
            e.printStackTrace();
        } catch (NoHostAvailableException nhae) {
            nhae.printStackTrace();
        }
        return cluster;
    }
};
I tried with different PoolingOptions and SocketOptions settings but no success.
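For reference, the kind of tuning I experimented with looked roughly like this inside buildCluster (the values are arbitrary examples, not a recommendation; the classes are from com.datastax.driver.core in the 3.x driver):

// Sketch of the pooling/socket tuning I tried; the values are arbitrary examples.
PoolingOptions poolingOptions = new PoolingOptions()
        .setConnectionsPerHost(HostDistance.LOCAL, 1, 2)
        .setMaxRequestsPerConnection(HostDistance.LOCAL, 1024);
SocketOptions socketOptions = new SocketOptions()
        .setConnectTimeoutMillis(10000)
        .setReadTimeoutMillis(20000);

cluster = builder
        .addContactPoint("localhost")
        .withPort(9042)
        .withPoolingOptions(poolingOptions)
        .withSocketOptions(socketOptions)
        .build();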
Cassandra Sink:
CassandraSink.addSink(dataRows)
        .setQuery("insert into table_name_(16 columns names) " +
                "values (16 placeholders);")
        .enableWriteAheadLog()
        .setClusterBuilder(builder)
        .setFailureHandler(new CassandraFailureHandler() {
            @Override
            public void onFailure(Throwable throwable) {
                LOG.error("A {} occurred.", "Cassandra Failure", throwable);
            }
        })
        .build()
        .setParallelism(1)
        .name("Cassandra Sink For Unique Count every N minutes.");
The full stack trace from the Flink job manager:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [localhost/127.0.0.1] Error writing))
at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:231)
at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:77)
at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1414)
at com.datastax.driver.core.Cluster.init(Cluster.java:162)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:333)
at com.datastax.driver.core.Cluster.connectAsync(Cluster.java:308)
at com.datastax.driver.core.Cluster.connect(Cluster.java:250)
at org.apache.flink.streaming.connectors.cassandra.CassandraSinkBase.createSession(CassandraSinkBase.java:143)
at org.apache.flink.streaming.connectors.cassandra.CassandraSinkBase.open(CassandraSinkBase.java:87)
at org.apache.flink.streaming.connectors.cassandra.AbstractCassandraTupleSink.open(AbstractCassandraTupleSink.java:49)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
at org.apache.flink.streaming.api.operators.StreamSink.open(StreamSink.java:48)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1007)
at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:454)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:449)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:461)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
at java.base/java.lang.Thread.run(Thread.java:834)
Any help is appreciated.
Edit:
I just tried using two separate Cassandra instances (different machines and different clusters). I then pointed one job to one instance and the other job to the other instance. Nothing changed; I still get the same error.
I also tried to reduce dependencies; here is the new pom file:
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.abcde.ai</groupId>
<artifactId>analytics-etl</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>Flink Quickstart Job</name>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.10.2</flink.version>
<java.version>1.8</java.version>
<scala.binary.version>2.11</scala.binary.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
</properties>
<repositories>
<repository>
<id>apache.snapshots</id>
<name>Apache Development Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
<dependencies>
<!-- Apache Flink dependencies -->
<!-- These dependencies are provided, because they should not be packaged into the JAR file. -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.6</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-cassandra_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>commons-configuration</groupId>
<artifactId>commons-configuration</artifactId>
<version>1.10</version>
</dependency>
<!-- Add logging framework, to produce console output when running in the IDE. -->
<!-- These dependencies are excluded from the application JAR by default. -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.7</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
<scope>runtime</scope>
</dependency>
</dependencies>
<build>
<plugins>
<!-- Java Compiler -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
<!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. -->
<!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<!-- Run shade goal on package phase -->
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<excludes>
<exclude>org.apache.flink:force-shading</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>org.slf4j:*</exclude>
<exclude>log4j:*</exclude>
</excludes>
</artifactSet>
<filters>
<filter>
<!-- Do not copy the signatures in the META-INF folder.
Otherwise, this might cause SecurityExceptions when using the JAR. -->
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.abcde.analytics.etl.KafkaUniqueCountsStreamingJob</mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
<pluginManagement>
<plugins>
<!-- This improves the out-of-the-box experience in Eclipse by resolving some warnings. -->
<plugin>
<groupId>org.eclipse.m2e</groupId>
<artifactId>lifecycle-mapping</artifactId>
<version>1.0.0</version>
<configuration>
<lifecycleMappingMetadata>
<pluginExecutions>
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<versionRange>[3.1.1,)</versionRange>
<goals>
<goal>shade</goal>
</goals>
</pluginExecutionFilter>
<action>
<ignore/>
</action>
</pluginExecution>
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<versionRange>[3.1,)</versionRange>
<goals>
<goal>testCompile</goal>
<goal>compile</goal>
</goals>
</pluginExecutionFilter>
<action>
<ignore/>
</action>
</pluginExecution>
</pluginExecutions>
</lifecycleMappingMetadata>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
Edit:
I managed to narrow down the problem. The error goes away when I mark the dependency flink-connector-cassandra as provided and simply copy the jar file from my local maven repository (~/.m2/repository/org/apache/flink/flink-connector-cassandra_2.11/1.10.2/flink-connector-cassandra_2.11-1.10.2.jar) to the Flink lib folder. My problem is solved, but the root cause is still a mystery.

I might be wrong, but most likely the issue is caused by a Netty client version conflict. The error states NoHostAvailableException, but the underlying error is a TransportException with an "Error writing" message. Cassandra is definitely operating well.
There is a somewhat similar Stack Overflow case - Cassandra - error writing - with very similar symptoms: a single project runs well, but adding one more fails with an AllNodesFailedException whose root cause is a TransportException with an "Error writing" message. The author was able to solve it by unifying the Netty client version.
In your case, I'm not sure why there are so many dependencies, so I would try to get rid of all the extras and libraries and leave just the Flink (v 1.10.0-scala_2.12) and Flink Cassandra Connector (flink-connector-cassandra_2.12:jar:1.10.0) libraries. They should already include the necessary drivers, Netty, etc. All other drivers should be skipped (at least for an initial iteration, to confirm that this solves the issue and that it is a library conflict).
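If the extra drivers really are needed, another option is to force a single Netty onto the classpath by excluding it from the extras artifact. This is only a sketch; the groupId below is assumed, and the wildcard exclusion requires Maven 3.2.1 or newer:

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-extras</artifactId>
    <version>3.10.2</version>
    <exclusions>
        <!-- keep only the Netty that ships with the Flink Cassandra connector -->
        <exclusion>
            <groupId>io.netty</groupId>
            <artifactId>*</artifactId>
        </exclusion>
    </exclusions>
</dependency>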

To fix the error, I marked the dependency flink-connector-cassandra as provided, copied the jar file from my local maven repository (~/.m2/repository/org/apache/flink/flink-connector-cassandra_2.11/1.10.2/flink-connector-cassandra_2.11-1.10.2.jar) to the Flink lib folder, and restarted Flink. Here is the relevant part of my new pom.xml:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-cassandra_${scala.binary.version}</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
How did I find this? I was about to compile the connector from source with a more recent driver version. First I tried to reproduce the error with the unchanged sources, so I compiled it without changing anything and put the jar into the Flink lib folder - hooray, it works! I then suspected that the jar from Maven was somehow different, copied it into the lib folder, and to my surprise it also worked.
My problem is solved but the root cause remains a mystery.
My last attempt was to check whether any packages conflict with the Cassandra connector, so I ran mvn dependency:tree -Dverbose. There was one conflict, with org.apache.flink:flink-metrics-dropwizard over metrics-core:
[INFO] +- org.apache.flink:flink-connector-cassandra_2.12:jar:1.10.0:provided
[INFO] | +- (io.dropwizard.metrics:metrics-core:jar:3.1.2:provided - omitted for conflict with 3.1.5)
[INFO] | \- (org.apache.flink:force-shading:jar:1.10.0:provided - omitted for duplicate)
I removed this dependency from my project, but the error remains unless the connector is marked as provided and its jar is placed in the lib folder.

Related

Azure Web Deployment Fails - Website with given name already exists

When deploying my application I receive the following error: Website with given name azuredemo already exists.
I am attempting to deploy a Java Spring Boot web application to Azure. I created a Spring Boot application that works on my local machine against port 80.
I then ran mvn azure-webapp:config and configured the Application and then ran config a second time to configure the Runtime.
I get the following output from maven:
AppName : azuredemo
ResourceGroup : azuredemo-rg
Region : eastus
PricingTier : Basic_B1
OS : Linux
Java : Java 11
Web server stack: Tomcat 9.0
Deploy to slot : false
Then I ran the command mvn azure-webapp:deploy to deploy the application to Azure.
[INFO] Scanning for projects...
[INFO]
[INFO] -----------------< demo:azuredemo >------------------
[INFO] Building azuredemo 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- azure-webapp-maven-plugin:1.12.0:deploy (default-cli) @ azuredemo ---
[WARNING] The POM for com.microsoft.azure.applicationinsights.v2015_05_01:azure-mgmt-insights:jar:1.0.0-beta is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.microsoft.applicationinsights.core.dependencies.xstream.core.util.Fields (file:/C:/xxxxx/.m2/repository/com/microsoft/azure/applicationinsights-core/2.6.1/applicationinsights-core-2.6.1.jar) to field java.util.TreeMap.comparator
WARNING: Please consider reporting this to the maintainers of com.microsoft.applicationinsights.core.dependencies.xstream.core.util.Fields
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
[INFO] Auth Type : AZURE_CLI, Auth Files : [C:\xxxxx\.azure\azureProfile.json, C:\xxxxx\.azure\accessTokens.json]
[INFO] Subscription : Pay-As-You-Go(yyyyyyyy)
[INFO] Target Web App doesn't exist. Creating a new one...
[INFO] Creating App Service Plan 'ServicePlandaaaa-aaa'...
[INFO] Successfully created App Service Plan.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 30.596 s
[INFO] Finished at: 2021-02-03T15:02:13-05:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.microsoft.azure:azure-webapp-maven-plugin:1.12.0:deploy (default-cli) on project azuredemo: Website with given name azuredemo already exists.: OnError while emitting onNext value: retrofit2.Response.class -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
So to troubleshoot the issue, I deleted all of the App Services and all of the Resource Groups, then attempted the deployment again, with the same error.
I understand I am probably missing something, but going through the portal and dashboards, I cannot find what the conflict is.
My Environment
Maven 3.6.3
Java 11
Spring Boot 2.3.8
az cli 2.18.0
Pay-As-You-Go subscription
My Maven POM
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.3.8.RELEASE</version>
<relativePath/>
<!-- lookup parent from repository -->
</parent>
<groupId>demo</groupId>
<artifactId>azuredemo</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>azuredemo</name>
<description>Demo project for Spring Boot</description>
<properties>
<java.version>11</java.version>
<azure.version>3.1.0</azure.version>
</properties>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>com.azure.spring</groupId>
<artifactId>azure-spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>com.azure.spring</groupId>
<artifactId>azure-spring-boot-starter-active-directory</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-devtools</artifactId>
<scope>runtime</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.junit.vintage</groupId>
<artifactId>junit-vintage-engine</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.azure.spring</groupId>
<artifactId>azure-spring-boot-bom</artifactId>
<version>${azure.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<configuration>
<excludes>
<exclude>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-configuration-processor</artifactId>
</exclude>
<exclude>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</exclude>
</excludes>
</configuration>
</plugin>
<plugin>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-webapp-maven-plugin</artifactId>
<version>1.12.0</version>
<configuration>
<authType>azure_cli</authType>
<resourceGroup>azuredemo-rg</resourceGroup>
<appName>azuredemo</appName>
<pricingTier>B1</pricingTier>
<region>eastus</region>
<deployment>
<resources>
<resource>
<directory>${project.basedir}/target</directory>
<includes>
<include>*.jar</include>
</includes>
</resource>
</resources>
</deployment>
<runtime>
<os>Linux</os>
<javaVersion>Java 11</javaVersion>
<webContainer>Tomcat 9.0</webContainer>
</runtime>
</configuration>
</plugin>
</plugins>
</build>
</project>
The names of Azure App Services are required to be globally unique. Presumably, someone else has already created a web app with the name 'azuredemo'. Try making the name more unique and publishing again.
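For instance, something along these lines in the azure-webapp-maven-plugin configuration; the suffix shown is only an example, and any name unlikely to be taken by another Azure customer works:

<plugin>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>azure-webapp-maven-plugin</artifactId>
    <version>1.12.0</version>
    <configuration>
        <resourceGroup>azuredemo-rg</resourceGroup>
        <!-- App Service names are global across Azure, so pick something distinctive -->
        <appName>azuredemo-jsmith-eastus-01</appName>
    </configuration>
</plugin>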
In my case I wanted to update an app with a new deployment. If you get the error in this situation like I did, make sure that the resource group name matches the one that already exists in Azure.

Maven SLF4J multiple bindings, previous solutions fail

I have a project that has multiple SLF4J bindings.
I have read and tried the solutions in this SO post, this other SO post
and the slf4j website.
What I see when I run the code is this:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Mike/.m2/repository/org/jlab/coat/coat-libs/5.1-SNAPSHOT/coat-libs-5.1-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Mike/.m2/repository/org/slf4j/slf4j-log4j12/1.7.16/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
However in my pom.xml file, I already have
<dependency>
<groupId>org.jlab.coat</groupId>
<artifactId>coat-libs</artifactId>
<version>5.1-SNAPSHOT</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
and in the mvn dependency:tree output I do not see the excluded jar under this dependency, only under the Spark jar, i.e.
[INFO] com.IKP:DCdatabase:jar:0.0.1-SNAPSHOT
[INFO] +- org.jlab.coat:coat-libs:jar:5.1-SNAPSHOT:compile
[INFO] +- org.apache.spark:spark-core_2.11:jar:2.2.1:compile
...
...
...
[INFO] | +- org.slf4j:slf4j-api:jar:1.7.16:compile
[INFO] | +- org.slf4j:jul-to-slf4j:jar:1.7.16:compile
[INFO] | +- org.slf4j:jcl-over-slf4j:jar:1.7.16:compile
[INFO] | +- log4j:log4j:jar:1.2.17:compile
[INFO] | +- org.slf4j:slf4j-log4j12:jar:1.7.16:compile
I have also tried a few more steps, such as cleaning out my .m2 directory and building all sources from scratch, but I still see this double binding.
A nuance I noticed is with the log suppression for Spark, i.e. these calls:
Logger.getLogger("org.apache.spark.SparkContext").setLevel(Level.WARN);
Logger.getLogger("org").setLevel(Level.OFF);
Logger.getLogger("akka").setLevel(Level.OFF);
no longer set the levels to off, and I see output at all levels.
Is there another way to remove this multiple binding for SLF4J?
It looks to me like coat-libs has been packaged as an uber jar which is why slf4j doesn't show up as a dependency when you do mvn dependency:tree. This explains why your exclusion isn't working.
I would recommend taking a look at the maven shade plugin for packaging your jar. Then you can use a filter to exclude the slf4j classes that come from coat-libs.
For example, maybe something like:
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>org.jlab.coat:coat-libs</artifact>
<excludes>
<exclude>org/slf4j/**</exclude>
</excludes>
</filter>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Just like the message says, you're getting one binding from coat-libs-5.1-SNAPSHOT.jar and another binding from slf4j-log4j12-1.7.16.jar. It's not that coat-libs is trying to bring in a dependency that has a binding; coat-libs itself contains a logging binding that tries to handle SLF4J logging. You can only use one logging binding, so you either need to remove the usage of coat-libs, or you need to exclude slf4j-log4j12 from the dependencies of spark-core, depending on which logging framework you're actually trying to use.
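If you go the exclusion route, it would look roughly like this (coordinates taken from the dependency tree above; adjust them to however spark-core is actually declared in your pom):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.1</version>
    <exclusions>
        <!-- drop Spark's log4j binding so only the binding inside coat-libs remains -->
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>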

Spark error with google/guava library: java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.refreshAfterWrite

I have a simple Spark project in which the pom.xml dependencies are only the basic Scala, scalatest/junit, and Spark artifacts:
<dependency>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>3.0.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
</dependencies>
When attempting to run a basic spark program the SparkSession init fails on this line:
SparkSession.builder.master(master).appName("sparkApp").getOrCreate
Here is the output / error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/04/07 18:06:15 INFO SparkContext: Running Spark version 2.2.1
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.refreshAfterWrite(JLjava/util/concurrent/TimeUnit;)Lcom/google/common/cache/CacheBuilder;
at org.apache.hadoop.security.Groups.<init>(Groups.java:96)
at org.apache.hadoop.security.Groups.<init>(Groups.java:73)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:293)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:789)
at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:774)
at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:647)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2424)
at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2424)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2424)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:295)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
I have run Spark locally many dozens of times on other projects; what might be wrong with this simple one? Is there a dependency on the $HADOOP_HOME environment variable or similar?
Update: By downgrading the Spark version to 2.0.1 I was able to compile. That does not fix the problem (we need a newer version), but it helps point out the source of the problem.
Another update: In a different project the hack of downgrading to 2.0.1 does help - i.e. execution proceeds further - but then a similar exception happens when writing out to Parquet:
8/05/07 11:26:11 ERROR Executor: Exception in task 0.0 in stage 2741.0 (TID 2618)
java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build(Lcom/google/common/cache/CacheLoader;)Lcom/google/common/cache/LoadingCache;
at org.apache.hadoop.io.compress.CodecPool.createCache(CodecPool.java:62)
at org.apache.hadoop.io.compress.CodecPool.<clinit>(CodecPool.java:74)
at org.apache.parquet.hadoop.CodecFactory$BytesCompressor.<init>(CodecFactory.java:92)
at org.apache.parquet.hadoop.CodecFactory.getCompressor(CodecFactory.java:169)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:303)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:262)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetFileFormat.scala:562)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:139)
at org.apache.spark.sql.execution.datasources.BaseWriterContainer.newOutputWriter(WriterContainer.scala:131)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:247)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(InsertIntoHadoopFsRelationCommand.scala:143)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
at org.apache.spark.scheduler.Task.run(Task.scala:86)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
This error occurs due to version mismatch between Google's guava library and Spark.
Spark shades guava but many libraries use guava. You can try Shading the Guava dependencies as per this post.
Apache-Spark-User-List
Adding the shade plugin to your pom file and relocating the Google packages can resolve this issue. More information can be found here and here:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<relocations>
<relocation>
<pattern>com.google.common</pattern>
<shadedPattern>shade.com.google.common</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
If this also doesn't help, then pinning the Guava library to version 15.0 works nicely. The reason this workaround works lies in dependencyManagement. The nice SO answer is here:
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>15.0</version>
</dependency>
</dependencies>
</dependencyManagement>
I was getting this error in Spring Boot: java.lang.TypeNotPresentException: Type com.google.common.cache.CacheBuilderSpec
com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache
The issue is due to the "com.google.guava:guava" artifact. In Spring Boot this artifact comes in transitively through some other dependency, such as "spring-boot-starter-web" or "springfox-swagger2", so we need to first exclude Guava from that jar and then add an updated version of Guava.
Solution:
1. Add the Guava dependency above all the other dependencies so that Spring Boot picks up the latest version:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>19.0</version>
</dependency>
Then find the Spring Boot dependency that includes the "guava" artifact, exclude "guava" from that dependency, and add the Guava dependency as above.
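As a sketch, assuming springfox-swagger2 is the dependency pulling in the old Guava (the version shown is only an example), the exclusion would look like this:

<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger2</artifactId>
    <version>2.9.2</version>
    <exclusions>
        <!-- exclude the transitive Guava so the explicit 19.0 declaration wins -->
        <exclusion>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
        </exclusion>
    </exclusions>
</dependency>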

Kinesis Stream with Empty Records in Google Dataproc with Spark 1.6.1 Hadoop 2.7.2

I am trying to connect to Amazon Kinesis Stream from Google Dataproc but am only getting Empty RDDs.
Command: spark-submit --verbose --packages org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.2 demo_kinesis_streaming.py --awsAccessKeyId XXXXX --awsSecretKey XXXX
Detailed Log: https://gist.github.com/sshrestha-datalicious/e3fc8ebb4916f27735a97e9fcc42136c
More Details
Spark 1.6.1
Hadoop 2.7.2
Assembly Used: /usr/lib/spark/lib/spark-assembly-1.6.1-hadoop2.7.2.jar
Surprisingly, it works when I download and use the assembly containing Spark 1.6.1 with Hadoop 2.6.0, with the following command.
Command: SPARK_HOME=/opt/spark-1.6.1-bin-hadoop2.6 spark-submit --verbose --packages org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.2 demo_kinesis_streaming.py --awsAccessKeyId XXXXX --awsSecretKey XXXX
I am not sure if there is a version conflict between the two Hadoop versions and the Kinesis ASL, or whether it has to do with custom settings in Google Dataproc.
Any help would be appreciated.
Thanks
Suren
Our team was in a similar situation and we managed to solve it:
We are running on the same environment:
DataProc Image Version 1 with Spark 1.6.1 with Hadoop 2.7
A simple SparkStream Kinesis Script that boils down to this:
# Run the script as
# spark-submit \
# --packages org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.1\
# demo_kinesis_streaming.py\
# --awsAccessKeyId FOO\
# --awsSecretKey BAR\
# ...
import argparse
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.storagelevel import StorageLevel
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream
ap = argparse.ArgumentParser()
ap.add_argument('--awsAccessKeyId', required=True)
ap.add_argument('--awsSecretKey', required=True)
ap.add_argument('--stream_name')
ap.add_argument('--region')
ap.add_argument('--app_name')
ap = ap.parse_args()
kinesis_application_name = ap.app_name
kinesis_stream_name = ap.stream_name
kinesis_region = ap.region
kinesis_endpoint_url = 'https://kinesis.{}.amazonaws.com'.format(ap.region)
spark_context = SparkContext(appName=kinesis_application_name)
streamingContext = StreamingContext(spark_context, 60)
kinesisStream = KinesisUtils.createStream(
ssc=streamingContext,
kinesisAppName=kinesis_application_name,
streamName=kinesis_stream_name,
endpointUrl=kinesis_endpoint_url,
regionName=kinesis_region,
initialPositionInStream=InitialPositionInStream.TRIM_HORIZON,
checkpointInterval=60,
storageLevel=StorageLevel.MEMORY_AND_DISK_2,
awsAccessKeyId=ap.awsAccessKeyId,
awsSecretKey=ap.awsSecretKey
)
kinesisStream.pprint()
streamingContext.start()
streamingContext.awaitTermination()
The code had been tested working on AWS EMR and on local environment using the same Spark 1.6.1 with Hadoop 2.7 setup.
On Dataproc, the script returns empty RDDs without printing any error, even though there is data in the Kinesis stream.
We've tested it on DataProc with the following envs, and none of them worked.
Submit job via gcloud command;
ssh into Cluster Master Node and run in yarn client mode;
ssh into Cluster Master Node and run as local[*].
We enabled verbose logging by updating /etc/spark/conf/log4j.properties with the following values:
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c: %m%n
log4j.logger.org.eclipse.jetty=ERROR
log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=DEBUG
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=DEBUG
log4j.logger.org.apache.spark=DEBUG
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=DEBUG
log4j.logger.org.spark-project.jetty.server.handler.ContextHandler=DEBUG
log4j.logger.org.apache=DEBUG
log4j.logger.com.amazonaws=DEBUG
We noticed something weird in the log (note that spark-streaming-kinesis-asl_2.10:1.6.1 uses aws-sdk-java/1.9.37 as a dependency, while somehow aws-sdk-java/1.7.4 was used, as suggested by the user-agent):
16/07/10 06:30:16 DEBUG com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer: PROCESS task encountered execution exception:
java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: com.amazonaws.services.kinesis.model.GetRecordsResult.getMillisBehindLatest()Ljava/lang/Long;
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.checkAndSubmitNextTask(ShardConsumer.java:137)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.consumeShard(ShardConsumer.java:126)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker.run(Worker.java:334)
at org.apache.spark.streaming.kinesis.KinesisReceiver$$anon$1.run(KinesisReceiver.scala:174)
Caused by: java.lang.NoSuchMethodError: com.amazonaws.services.kinesis.model.GetRecordsResult.getMillisBehindLatest()Ljava/lang/Long;
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ProcessTask.call(ProcessTask.java:119)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:48)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:23)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
content-length:282
content-type:application/x-amz-json-1.1
host:kinesis.ap-southeast-2.amazonaws.com
user-agent:SparkDemo,amazon-kinesis-client-library-java-1.4.0, aws-sdk-java/1.7.4 Linux/3.16.0-4-amd64 OpenJDK_64-Bit_Server_VM/25.91-b14/1.8.0_91
x-amz-date:20160710T063016Z
x-amz-target:Kinesis_20131202.GetRecords
It appears that Dataproc had built its own Spark with a much older AWS SDK as a dependency, and it blows up when used in conjunction with code that requires a much newer version of the AWS SDK, although we are not sure exactly which module caused this error.
Update:
Based on @DennisHuo's comment, this behaviour is caused by Hadoop's leaky classpath:
https://github.com/apache/hadoop/blob/branch-2.7.2/hadoop-project/pom.xml#L650
To make things worse, the AWS KCL 1.4.0 (used by Spark 1.6.1) silently suppresses any runtime error instead of throwing a RuntimeException, which causes a lot of headaches while debugging.
Eventually our solution was to build our own org.apache.spark:spark-streaming-kinesis-asl_2.10:1.6.1 with all of its com.amazonaws.* classes shaded.
We built the JAR with the following pom (update spark/extra/kinesis-asl/pom.xml) and shipped the new JAR with the --jars flag in spark-submit:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.10</artifactId>
<version>1.6.1</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<!-- Kinesis integration is not included by default due to ASL-licensed code. -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
<packaging>jar</packaging>
<name>Spark Kinesis Integration</name>
<properties>
<sbt.project.name>streaming-kinesis-asl</sbt.project.name>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-client</artifactId>
<version>${aws.kinesis.client.version}</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-producer</artifactId>
<version>${aws.kinesis.producer.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.binary.version}</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-test-tags_${scala.binary.version}</artifactId>
</dependency>
</dependencies>
<build>
<outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
<testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<configuration>
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
<!-- At a minimum we must include this to force effective pom generation -->
<include>org.spark-project.spark:unused</include>
<include>com.amazonaws:*</include>
</includes>
</artifactSet>
<relocations>
<relocation>
<pattern>com.amazonaws</pattern>
<shadedPattern>foo.bar.YO.com.amazonaws</shadedPattern>
<includes>
<include>com.amazonaws.**</include>
</includes>
</relocation>
</relocations>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>

How to overcome maven org.eclipse.persistence JAR signature error

I'm trying to add the modelgen jar to my Maven project to enable automatic metamodel class generation; however, I am getting the following error when I try to compile the project:
Failed to execute goal org.bsc.maven:maven-processor-plugin:2.2.4:process (process) on project Project-per: Error executing: java.lang.RuntimeException: java.lang.SecurityException: class "org.eclipse.persistence.internal.jpa.modelgen.CanonicalModelProperties"'s signer information does not match signer information of other classes in the same package
As best as I can determine, there is most likely a mismatch in the JAR signatures causing this. However, I don't understand why, or how to fix the problem.
My POM (Relevant parts only)
<dependency>
<groupId>org.eclipse.persistence</groupId>
<artifactId>eclipselink</artifactId>
<version>2.5.0-RC2</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.persistence</groupId>
<artifactId>org.eclipse.persistence.nosql</artifactId>
<version>2.5.0-RC2</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.eclipse.persistence</groupId>
<artifactId>org.eclipse.persistence.jpa.modelgen</artifactId>
<version>2.4.2</version>
</dependency>
NB: Everything works fine with the nosql and eclipselink artifacts; it's just when I add the modelgen one.
OK, as best I can tell I was right: the problem was caused by a signature difference in the JAR files. How this can happen with a library as popular as EclipseLink is beyond me, but hey.
My solution was to use a snapshot, as they are generally not signed. Given that I was unable to obtain the snapshot from Central, I used the following repo to obtain it:
<repository>
<id>sonatype-snapshots</id>
<url>https://oss.sonatype.org/content/repositories/snapshots/</url>
</repository>
<dependency>
<groupId>org.eclipse.persistence</groupId>
<artifactId>org.eclipse.persistence.jpa.modelgen</artifactId>
<version>2.6.0-SNAPSHOT</version>
</dependency>
Add the following pluginExecution into your pom:
<pluginExecution>
<pluginExecutionFilter>
<groupId>org.bsc.maven</groupId>
<artifactId>maven-processor-plugin</artifactId>
<versionRange>[2.2.4,)</versionRange>
<goals>
<goal>process</goal>
</goals>
</pluginExecutionFilter>
<action>
<ignore />
</action>
</pluginExecution>
