I am trying to stream a producer topic from Kafka, but I am getting an error that kafka is not a valid data source. I have imported all the required packages (Kafka, Spark SQL, streaming, etc.).
build.gradle file:
dependencies {
compile group: 'org.apache.kafka', name: 'kafka-clients', version: '2.2.0'
compile group: 'org.apache.kafka', name: 'kafka_2.12', version: '2.2.0'
compile group: 'org.scala-lang', name: 'scala-library', version: '2.12.8'
compile group: 'org.scala-lang', name: 'scala-reflect', version: '2.12.8'
compile group: 'org.scala-lang', name: 'scala-compiler', version: '2.12.8'
compile group: 'org.scala-lang.modules', name: 'scala-parser-combinators_2.12', version: '1.1.2'
compile group: 'org.scala-lang.modules', name: 'scala-swing_2.12', version: '2.1.1'
runtime group: 'org.apache.spark', name: 'spark-mllib_2.12', version: '2.4.3'
compile group: 'org.apache.spark', name: 'spark-core_2.12', version: '2.4.3'
compile 'org.apache.spark:spark-streaming-flume-assembly_2.11:2.1.0'
compile group: 'org.apache.spark', name: 'spark-sql_2.12', version: '2.4.3'
compile group: 'org.apache.spark', name: 'spark-graphx_2.12', version: '2.4.3'
compile group: 'org.apache.spark', name: 'spark-launcher_2.12', version: '2.4.3'
testCompile group: 'org.apache.spark', name: 'spark-catalyst_2.12', version: '2.4.3'
provided group: 'org.apache.spark', name: 'spark-streaming_2.12', version: '2.4.3'
provided group: 'org.apache.spark', name: 'spark-hive_2.12', version: '2.4.3'
compile group: 'org.apache.spark', name: 'spark-avro_2.12', version: '2.4.3'
compile group: 'com.databricks', name: 'spark-avro_2.11', version: '4.0.0'
compile group: 'io.confluent', name: 'kafka-avro-serializer', version: '3.1.1'
compile group: 'mysql', name: 'mysql-connector-java', version: '8.0.16'
compile group: 'org.apache.spark', name: 'spark-streaming-kafka_2.11', version: '1.6.3'
compile group: 'org.apache.spark', name: 'spark-streaming-kafka-0-10_2.12', version: '2.4.3'
provided group: 'org.apache.spark', name: 'spark-sql-kafka-0-10_2.12', version: '2.4.3'
}
Code:
import com.util.SparkOpener
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, LocationStrategies}
object SparkConsumer extends SparkOpener
{
val spark=SparkSessionLoc("SparkKafkaStream")
spark.sparkContext.setLogLevel("ERROR")
def main(args: Array[String]): Unit = {
val Kafka_F1Topic = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092,localhost:9093,localhost:9094")
  .option("subscribe", "F1CarDetails")
  .option("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  .option("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  .load()
Kafka_F1Topic.show()
}
}
Result:
Exception in thread "main" org.apache.spark.sql.AnalysisException:
Failed to find data source: kafka. Please deploy the application as
per the deployment section of "Structured Streaming + Kafka
Integration Guide".;
The Structured Streaming + Kafka integration guide uses the same format.
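The "Failed to find data source: kafka" error usually means spark-sql-kafka-0-10 is missing from the runtime classpath. In the build file above it is declared provided, so it is available at compile time but excluded from the packaged application. A sketch of the relevant change, assuming Spark 2.4.3 on Scala 2.12:

```groovy
dependencies {
    // Change 'provided' to 'compile' so the connector (and its
    // META-INF/services registration of the "kafka" source) ships
    // with the application at runtime.
    compile group: 'org.apache.spark', name: 'spark-sql-kafka-0-10_2.12', version: '2.4.3'

    // Also remove the Scala 2.11 artifacts in the block above
    // (spark-streaming-flume-assembly_2.11, spark-streaming-kafka_2.11:1.6.3,
    // com.databricks:spark-avro_2.11): mixing Scala binary versions on one
    // classpath leads to further runtime failures.
}
```

Alternatively, submit with --packages org.apache.spark:spark-sql-kafka-0-10_2.12:2.4.3. Two further notes: key.serializer/value.serializer are producer settings and are not needed on readStream, and show() cannot be called on a streaming DataFrame, so even with the dependency fixed the query must be started via writeStream (for example with the console sink) instead.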
dependencies {
compile('org.apache.hadoop:hadoop-azure:3.2.1'){
exclude group: "com.google.guava", module : "guava"
}
// Azure storage dependencies
compile group: 'com.azure', name: 'azure-storage-blob', version: '12.7.0'
// HBase
compile group: 'org.apache.hbase', name: 'hbase-client', version: '1.6.0'
compile group: 'io.projectreactor', name: 'reactor-core', version: '3.3.5.RELEASE' , force: true
compile group: 'io.projectreactor.netty', name: 'reactor-netty', version: '0.9.7.RELEASE', force: true
compile group: 'io.netty', name: 'netty-transport', version: '4.1.49.Final', force: true
compile (group: 'io.netty', name: 'netty-codec-http', version: '4.1.49.Final', force: true){
exclude group: 'io.netty', module: 'netty-codec'
}
compile group: 'io.netty', name: 'netty-common', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-handler', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-transport-native-epoll', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-resolver', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-buffer', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-transport-native-unix-common', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-codec', version: '4.1.49.Final', force: true
compile group: 'io.netty', name: 'netty-all', version: '4.1.49.Final', force: true
}
These are the Spark cluster dependencies. I have removed the conflicting Netty versions, but it still fails on Databricks. I also checked the jar, and it does contain the handler classes.
dependencies {
// Spark dependency.
compile( group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5')
{
exclude group: "io.netty", module : "netty"
exclude group: "io.netty", module : "netty-all"
}
// Spark for SQL and parquet file.
compile group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.5'
compile group: 'com.esotericsoftware', name: 'kryo', version: '4.0.2'
compile 'org.apache.commons:commons-math3:3.6.1'
compile group: 'org.apache.commons', name: 'commons-text', version: '1.8'
compile group: 'org.codehaus.janino', name: 'janino', version: '3.1.2'
// Gson
compile group: 'com.google.code.gson', name: 'gson', version: '2.8.6'
// Java tuple for Pair.
compile group: 'org.javatuples', name: 'javatuples', version: '1.2'
// Lombok dependency
compileOnly 'org.projectlombok:lombok:1.18.12'
// Use JUnit test framework
testImplementation 'junit:junit:4.12'
}
I am getting a "could not initialise" error. Please let me know what I am missing. I can see in the dependency tree that the new version of netty-handler is being used:
20/09/13 18:57:43 ERROR Schedulers: Scheduler worker in group main failed with an uncaught exception
java.lang.NoSuchMethodError: io.netty.handler.ssl.SslProvider.isAlpnSupported(Lio/netty/handler/ssl/SslProvider;)Z
at reactor.netty.http.client.HttpClientSecure.<clinit>(HttpClientSecure.java:79)
at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect.lambda$subscribe$0(HttpClientConnect.java:301)
at reactor.core.publisher.MonoCreate.subscribe(MonoCreate.java:57)
at reactor.core.publisher.FluxRetryPredicate$RetryPredicateSubscriber.resubscribe(FluxRetryPredicate.java:124)
at reactor.core.publisher.MonoRetryPredicate.subscribeOrReturn(MonoRetryPredicate.java:51)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:57)
at reactor.netty.http.client.HttpClientConnect$MonoHttpConnect.subscribe(HttpClientConnect.java:326)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)
at reactor.core.publisher.InternalMonoOperator.subscribe(InternalMonoOperator.java:64)
at reactor.core.publisher.MonoDelaySubscription.accept(MonoDelaySubscription.java:52)
at reactor.core.publisher.MonoDelaySubscription.accept(MonoDelaySubscription.java:33)
at reactor.core.publisher.FluxDelaySubscription$DelaySubscriptionOtherSubscriber.onNext(FluxDelaySubscription.java:123)
at reactor.core.publisher.MonoDelay$MonoDelayRunnable.run(MonoDelay.java:117)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
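The NoSuchMethodError suggests that an older netty-handler (without SslProvider.isAlpnSupported, added in the Netty 4.1.x line) is still winning dependency resolution somewhere. Rather than forcing each io.netty module individually, a resolutionStrategy pin can align the whole group in one place — a sketch, assuming 4.1.49.Final is the intended version:

```groovy
configurations.all {
    resolutionStrategy {
        // Pin every io.netty module to a single version so reactor-netty
        // and the Spark/Azure transitive graphs agree on netty-handler.
        eachDependency { details ->
            if (details.requested.group == 'io.netty') {
                details.useVersion '4.1.49.Final'
            }
        }
    }
}
```

Running gradle dependencies on the runtime configuration shows which version actually wins. Note that on Databricks the cluster's own Netty jars can still shadow the ones packaged with the application, in which case shading/relocating the io.netty packages in a fat jar may be the only reliable fix.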
When I converted an application that uses JAXB and java.mail from Java 8 to Java 11, I ran into lots of module problems that seemed intractable.
My build.gradle included:
compile group: 'javax.xml.bind', name: 'jaxb-api', version: '2.3.1'
compile group: 'javax.mail', name: 'mail', version: '1.4.7'
The module errors I got included:
the unnamed module reads package javax.activation from both java.activation and activation
I tried this, but it failed with the same error:
compile (group: 'javax.xml.bind', name: 'jaxb-api', version: '2.3.1') {
exclude group: 'javax.activation', module: 'activation'
}
compile group: 'javax.mail', name: 'mail', version: '1.4.7'
And I tried this which also failed:
compile group: 'javax.xml.bind', name: 'jaxb-api', version: '2.3.1'
compile (group: 'javax.mail', name: 'mail', version: '1.4.7') {
exclude group: 'javax.activation', module: 'activation'
}
The error was:
java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory
I also tried the following in Gradle with various exclude statements, all of which failed one way or another:
compile (group: 'javax.xml.bind', name: 'jaxb-api', version: '2.3.1')
compile (group: 'com.sun.xml.bind', name: 'jaxb-core', version: '2.3.0.1')
compile (group: 'com.sun.xml.bind', name: 'jaxb-impl', version: '2.3.0.1')
I also got a runtime error:
class not found com.sun.activation.registries.LogSupport
Solution
In the end I found that using glassfish.jaxb and jakarta.mail worked.
My build.gradle included:
compile(group: 'org.glassfish.jaxb', name: 'jaxb-runtime', version: '2.3.2') {
exclude group: 'jakarta.activation', module: 'jakarta.activation-api'
}
compile group: 'com.sun.mail', name: 'jakarta.mail', version: '1.6.5'
My module-info.java included:
requires java.xml.bind;
requires jakarta.mail;
requires com.sun.xml.bind; // needed this for jlink
Hope this helps.
I am getting the error below while reading Excel data using Apache POI:
javax.xml.stream.FactoryConfigurationError: Provider
com.bea.xml.stream.EventFactory not found.
Below is the code I am using:
InputStream inputStream = new ByteArrayInputStream(byteCodeConversion);
ZipSecureFile.setMinInflateRatio(-1.0d);
XSSFWorkbook wb = new XSSFWorkbook(inputStream);//Throwing the error
Below are the dependencies I am using:
compile group: 'stax', name: 'stax-api', version: '1.0.1'
compile group: 'commons-codec', name: 'commons-codec', version: '1.9'
compile group: 'org.apache.poi', name: 'poi-ooxml', version: '3.15'
compile group: 'org.apache.poi', name: 'poi-ooxml-schemas', version: '3.15'
compile group: 'org.apache.poi', name: 'poi', version: '3.15'
compile group: 'dom4j', name: 'dom4j', version: '1.6.1'
compile group: 'stax', name: 'stax-api', version: '1.0.1'
compile group: 'org.apache.commons', name: 'commons-collections4', version: '4.1'
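The FactoryConfigurationError here typically comes from the old stax:stax-api:1.0.1 jar: its javax.xml.stream factory defaults point at the BEA reference implementation (com.bea.xml.stream.*), which is not on the classpath. On Java 6 and later the StAX API ships with the JDK, so the standalone jar can simply be dropped. A sketch, assuming nothing else in the build requires the separate API jar:

```groovy
dependencies {
    // Remove both explicit 'stax:stax-api:1.0.1' lines above, and keep
    // the jar out of the transitive graph as well (the older xmlbeans
    // versions pulled in by POI 3.15 declare it):
    compile(group: 'org.apache.poi', name: 'poi-ooxml', version: '3.15') {
        exclude group: 'stax', module: 'stax-api'
    }
}
```

With stax-api off the classpath, XSSFWorkbook falls back to the JDK's built-in StAX implementation. An alternative is to add a concrete StAX implementation such as Woodstox, but removing the stale API jar is usually sufficient.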
I wrote a Liferay module:
@Component(
immediate = true,
service = ModelListener.class
)
public class TopMessageListener extends BaseModelListener<JournalArticle> {
// Do stuff
}
with this bnd.bnd:
Bundle-SymbolicName: fr.free.nrw.impl
Bundle-Version: 1.0.0
Liferay-Require-SchemaVersion: 1.0.0
Import-Package: !org.apache.avalon.framework.logger, !org.apache.log \
*
And this in my build.gradle (among other things):
compileOnly group: "com.liferay.portal", name: "com.liferay.portal.kernel", version: "2.6.0"
compileInclude group: 'org.apache.httpcomponents', name: 'httpclient', version: '4.5.3'
It compiles fine, but deployment fails:
ClassNotFoundException: com.liferay.portal.kernel.model.BaseModelListener cannot be found by fr.free.nrw.impl_1.0.0
at org.eclipse.osgi.internal.loader.BundleLoader.findClassInternal(BundleLoader.java:444)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:357)
at org.eclipse.osgi.internal.loader.BundleLoader.findClass(BundleLoader.java:349)
at org.eclipse.osgi.internal.loader.ModuleClassLoader.loadClass(ModuleClassLoader.java:160)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
The error disappears if I replace compileOnly group: "com.liferay.portal" with compileInclude group: "com.liferay.portal", but I am sure that is not the correct way to do things. How can I correctly fix the ClassNotFoundException above?
Please try this dependency:
compileOnly group: "com.liferay.portal", name: "com.liferay.portal.kernel", version: "2.0.0"
Duplicate the httpcomponents line into two lines, one compileOnly and one runtime:
compileOnly group: "com.liferay.portal", name: "com.liferay.portal.kernel", version: "2.6.0"
compileOnly group: 'org.apache.httpcomponents', name: 'httpclient', version: '4.5.3'
runtime group: 'org.apache.httpcomponents', name: 'httpclient', version: '4.5.3'
That solved the problem for me.
I have added the jar named util-java.jar, which contains the above-mentioned file, to the build.gradle file. At compile time there is no error, but on executing the project I get a NoClassDefFoundError.
Please tell me how to resolve this problem. My build.gradle is as follows:
dependencies {
compile group: "biz.aQute.bnd", name: "biz.aQute.bndlib", version: "3.1.0"
compile group: "com.liferay", name: "com.liferay.osgi.util", version: "3.0.0"
compile group: "com.liferay", name: "com.liferay.portal.spring.extender", version: "2.0.0"
compile group: "com.liferay.portal", name: "com.liferay.portal.kernel", version: "2.6.0"
compile project(":modules:customuser:customuser-api")
compile group: 'com.liferay.portal', name: 'portal-kernel', version: '5.2.3'
runtime group: 'com.liferay.portal', name: 'portal-kernel', version: '5.2.3'
compile group: 'com.liferay.portal', name: 'util-java', version: '6.2.4'
runtime group: 'com.liferay.portal', name: 'util-java', version: '6.2.4'
}
Something like:
compileOnly group: "com.liferay", name: "com.liferay.portal.dao.orm.custom.sql", version: "1.0.5"
Should help you.
This is a version issue; change
compile group: "biz.aQute.bnd", name: "biz.aQute.bndlib", version: "3.1.0"
to
compile group: "biz.aQute.bnd", name: "biz.aQute.bndlib", version: "3.5.0"