Why isn't my Eureka server being found by the Hazelcast-Eureka-One plugin?

I have a working Hazelcast cluster configured with tcp-ip. I need it to work with Eureka discovery. I am trying to implement the hazelcast-eureka-one plugin.
The (Spring Boot) app already registers itself with Eureka successfully, using the @EnableEurekaClient annotation. I am not concerned with whether the Hazelcast Eureka client is the same client or a different one; I am fine with Hazelcast registering itself separately from the app, as long as it works.
When I remove eureka-client.properties, the app will not start and reports that eureka-client.properties cannot be found. When the file is in place, the app starts, but apparently none of the properties from eureka-client.properties are loaded, which leaves Hazelcast not knowing where the Eureka server is. The logs indicate that the properties file is found, but none of its properties seem to be applied.
Upgrading hazelcast-eureka-one to 1.1 makes no change.
Setting use-metadata-for-host-and-port to true makes no change.
Gradle:
compile group: 'com.hazelcast', name: 'hazelcast-spring', version: '3.9.4'
compile group: 'com.hazelcast', name: 'hazelcast-hibernate52', version: '1.2.3'
compile group: 'com.hazelcast', name: 'hazelcast-eureka-one', version: '1.0.1'
hazelcast.xml:
<?xml version="1.0" encoding="UTF-8"?>
<hazelcast xsi:schemaLocation="http://www.hazelcast.com/schema/config http://www.hazelcast.com/schema/config/hazelcast-config-3.9.xsd"
xmlns="http://www.hazelcast.com/schema/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<instance-name>app.name.hazelcast.sessions-instance</instance-name>
<group>
<name>app.name.hazelcast.sessions.local-group</name>
</group>
<network>
<join>
<multicast enabled="false"/>
<tcp-ip enabled="false"/>
<aws enabled="false"/>
<discovery-strategies>
<discovery-strategy class="com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy" enabled="true">
<properties>
<property name="self-registration">true</property>
<property name="namespace">hazelcast-app-name</property>
<property name="use-metadata-for-host-and-port">false</property>
</properties>
</discovery-strategy>
</discovery-strategies>
</join>
</network>
<map name="spring:session:sessions">
<attributes>
<attribute extractor="org.springframework.session.hazelcast.PrincipalNameExtractor">principalName</attribute>
</attributes>
<indexes>
<index>principalName</index>
</indexes>
</map>
</hazelcast>
eureka-client.properties:
hazelcast.shouldUseDns=false
hazelcast.datacenter=primary
hazelcast.name=hazelcast-app-name-sessions
hazelcast.serviceUrl.default=http://username:password@svcregistry1-dev.company.com:8580/eureka/,http://username:password@svcregistry2-dev.company.com:8590/eureka/
Log file:
Loading 'hazelcast.xml' from classpath.
2019-02-15 11:19:13,935 - INFO - [localhost-startStop-1] - [,,] - com.hazelcast.instance.AddressPicker : [LOCAL] [app.name.hazelcast.sessions.local-group] [3.9.4] Prefer IPv4 stack is true.
2019-02-15 11:19:14,166 - INFO - [localhost-startStop-1] - [,,] - com.hazelcast.instance.AddressPicker : [LOCAL] [app.name.hazelcast.sessions.local-group] [3.9.4] Picked [172.28.208.1]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
2019-02-15 11:19:14,179 - INFO - [localhost-startStop-1] - [,,] - com.hazelcast.system : [172.28.208.1]:5701 [app.name.hazelcast.sessions.local-group] [3.9.4] Hazelcast 3.9.4 (20180420 - b8001d5) starting at [172.28.208.1]:5701
2019-02-15 11:19:14,179 - INFO - [localhost-startStop-1] - [,,] - com.hazelcast.system : [172.28.208.1]:5701 [app.name.hazelcast.sessions.local-group] [3.9.4] Copyright (c) 2008-2018, Hazelcast, Inc. All Rights Reserved.
2019-02-15 11:19:14,179 - INFO - [localhost-startStop-1] - [,,] - com.hazelcast.system : [172.28.208.1]:5701 [app.name.hazelcast.sessions.local-group] [3.9.4] Configured Hazelcast Serialization version: 1
2019-02-15 11:19:14,616 - INFO - [localhost-startStop-1] - [,,] - c.h.s.i.o.impl.BackpressureRegulator : [172.28.208.1]:5701 [app.name.hazelcast.sessions.local-group] [3.9.4] Backpressure is disabled
2019-02-15 11:19:15,309 - DEBUG - [localhost-startStop-1] - [,,] - .n.c.u.OverridingPropertiesConfiguration : Base path set to file:///C:/Users/my.name/IdeaProjects/AppName/build/classes/main/
2019-02-15 11:19:15,310 - DEBUG - [localhost-startStop-1] - [,,] - .n.c.u.OverridingPropertiesConfiguration : FileName set to eureka-client.properties
2019-02-15 11:19:15,310 - DEBUG - [localhost-startStop-1] - [,,] - .n.c.u.OverridingPropertiesConfiguration : URL set to file:/C:/Users/my.name/IdeaProjects/AppName/build/classes/main/eureka-client.properties
2019-02-15 11:19:15,316 - INFO - [localhost-startStop-1] - [,,] - c.n.config.util.ConfigurationUtils : Loaded properties file file:/C:/Users/my.name/IdeaProjects/AppName/build/classes/main/eureka-client.properties
2019-02-15 11:19:15,326 - INFO - [localhost-startStop-1] - [,,] - .p.EurekaConfigBasedInstanceInfoProvider : Setting initial instance status as: STARTING
2019-02-15 11:19:15,334 - WARN - [localhost-startStop-1] - [,,] - c.n.config.util.ConfigurationUtils : file:/C:/Users/my.name/IdeaProjects/AppName/build/classes/main/eureka-client.properties is already loaded
2019-02-15 11:19:15,385 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Initializing Eureka in region us-east-1
2019-02-15 11:19:15,951 - INFO - [localhost-startStop-1] - [,,] - c.n.d.provider.DiscoveryJerseyProvider : Using JSON encoding codec LegacyJacksonJson
2019-02-15 11:19:15,951 - INFO - [localhost-startStop-1] - [,,] - c.n.d.provider.DiscoveryJerseyProvider : Using JSON decoding codec LegacyJacksonJson
2019-02-15 11:19:16,143 - INFO - [localhost-startStop-1] - [,,] - c.n.d.provider.DiscoveryJerseyProvider : Using XML encoding codec XStreamXml
2019-02-15 11:19:16,143 - INFO - [localhost-startStop-1] - [,,] - c.n.d.provider.DiscoveryJerseyProvider : Using XML decoding codec XStreamXml
2019-02-15 11:19:16,392 - INFO - [localhost-startStop-1] - [,,] - c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration
2019-02-15 11:19:16,394 - DEBUG - [localhost-startStop-1] - [,,] - c.n.discovery.endpoint.EndpointUtils : The availability zone for the given region us-east-1 are [defaultZone]
2019-02-15 11:19:16,394 - DEBUG - [localhost-startStop-1] - [,,] - c.n.d.s.r.aws.ConfigClusterResolver : Config resolved to []
2019-02-15 11:19:16,394 - ERROR - [localhost-startStop-1] - [,,] - c.n.d.s.r.aws.ConfigClusterResolver : Cannot resolve to any endpoints from provided configuration: {defaultZone=[]}
2019-02-15 11:19:16,612 - DEBUG - [localhost-startStop-1] - [,,] - c.n.d.s.r.a.ZoneAffinityClusterResolver : Local zone=defaultZone; resolved to: []
2019-02-15 11:19:16,612 - ERROR - [localhost-startStop-1] - [,,] - c.n.d.s.transport.EurekaHttpClients : Initial resolution of Eureka server endpoints failed. Check ConfigClusterResolver logs for more info
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Disable delta property : false
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Single vip registry refresh property : null
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Force full registry fetch : false
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Application is null : false
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Registered Applications size is zero : true
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Application version is -1: true
2019-02-15 11:19:16,647 - INFO - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : Getting all instance registry info from the eureka server
2019-02-15 11:19:16,648 - DEBUG - [localhost-startStop-1] - [,,] - c.n.d.s.t.d.SessionedEurekaHttpClient : Ending a session and starting anew
2019-02-15 11:19:16,655 - ERROR - [localhost-startStop-1] - [,,] - com.netflix.discovery.DiscoveryClient : DiscoveryClient_UNKNOWN/0c99d08b-8072-4fe4-a20f-c8653e10e374 - was unable to refresh its cache! status = There is no known eureka server; cluster server list is empty
com.netflix.discovery.shared.transport.TransportException: There is no known eureka server; cluster server list is empty
at com.netflix.discovery.shared.transport.decorator.RetryableEurekaHttpClient.execute(RetryableEurekaHttpClient.java:108)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.getApplications(EurekaHttpClientDecorator.java:134)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator$6.execute(EurekaHttpClientDecorator.java:137)
at com.netflix.discovery.shared.transport.decorator.SessionedEurekaHttpClient.execute(SessionedEurekaHttpClient.java:77)
at com.netflix.discovery.shared.transport.decorator.EurekaHttpClientDecorator.getApplications(EurekaHttpClientDecorator.java:134)
at com.netflix.discovery.DiscoveryClient.getAndStoreFullRegistry(DiscoveryClient.java:1051)
at com.netflix.discovery.DiscoveryClient.fetchRegistry(DiscoveryClient.java:965)
at com.netflix.discovery.DiscoveryClient.<init>(DiscoveryClient.java:414)
at com.netflix.discovery.DiscoveryClient.<init>(DiscoveryClient.java:269)
at com.netflix.discovery.DiscoveryClient.<init>(DiscoveryClient.java:265)
at com.netflix.discovery.DiscoveryClient.<init>(DiscoveryClient.java:257)
at com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy.<init>(EurekaOneDiscoveryStrategy.java:147)
at com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy.<init>(EurekaOneDiscoveryStrategy.java:55)
at com.hazelcast.eureka.one.EurekaOneDiscoveryStrategy$EurekaOneDiscoveryStrategyBuilder.build(EurekaOneDiscoveryStrategy.java:111)
at com.hazelcast.eureka.one.EurekaOneDiscoveryStrategyFactory.newDiscoveryStrategy(EurekaOneDiscoveryStrategyFactory.java:53)
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryService.buildDiscoveryStrategy(DefaultDiscoveryService.java:185)
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryService.loadDiscoveryStrategies(DefaultDiscoveryService.java:145)
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryService.<init>(DefaultDiscoveryService.java:60)
at com.hazelcast.spi.discovery.impl.DefaultDiscoveryServiceProvider.newDiscoveryService(DefaultDiscoveryServiceProvider.java:29)
at com.hazelcast.instance.Node.createDiscoveryService(Node.java:265)
at com.hazelcast.instance.Node.<init>(Node.java:216)
at com.hazelcast.instance.HazelcastInstanceImpl.createNode(HazelcastInstanceImpl.java:160)
at com.hazelcast.instance.HazelcastInstanceImpl.<init>(HazelcastInstanceImpl.java:128)
at com.hazelcast.instance.HazelcastInstanceFactory.constructHazelcastInstance(HazelcastInstanceFactory.java:195)
at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:174)
at com.hazelcast.instance.HazelcastInstanceFactory.newHazelcastInstance(HazelcastInstanceFactory.java:124)
at com.hazelcast.core.Hazelcast.newHazelcastInstance(Hazelcast.java:92)
at org.springframework.boot.autoconfigure.hazelcast.HazelcastServerConfiguration$HazelcastServerConfigFileConfiguration.hazelcastInstance(HazelcastServerConfiguration.java:56)
at org.springframework.boot.autoconfigure.hazelcast.HazelcastServerConfiguration$HazelcastServerConfigFileConfiguration$$EnhancerBySpringCGLIB$$d6cfebe6.CGLIB$hazelcastInstance$0(<generated>)
at org.springframework.boot.autoconfigure.hazelcast.HazelcastServerConfiguration$HazelcastServerConfigFileConfiguration$$EnhancerBySpringCGLIB$$d6cfebe6$$FastClassBySpringCGLIB$$3a3e2869.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228)
at org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:365)
at org.springframework.boot.autoconfigure.hazelcast.HazelcastServerConfiguration$HazelcastServerConfigFileConfiguration$$EnhancerBySpringCGLIB$$d6cfebe6.hazelcastInstance(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:583)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1246)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1096)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:535)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:317)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:315)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:251)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1135)
at org.springframework.beans.factory.support.DefaultListableBeanFactory$DependencyObjectProvider.getObject(DefaultListableBeanFactory.java:1665)
at org.springframework.session.hazelcast.config.annotation.web.http.HazelcastHttpSessionConfiguration.setHazelcastInstance(HazelcastHttpSessionConfiguration.java:96)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredMethodElement.inject(AutowiredAnnotationBeanPostProcessor.java:696)
at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:90)
at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessPropertyValues(AutowiredAnnotationBeanPostProcessor.java:370)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1336)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:572)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:317)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:315)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:373)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1246)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1096)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:535)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:317)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:315)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:199)
at org.springframework.beans.factory.config.DependencyDescriptor.resolveCandidate(DependencyDescriptor.java:251)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.doResolveDependency(DefaultListableBeanFactory.java:1135)
at org.springframework.beans.factory.support.DefaultListableBeanFactory.resolveDependency(DefaultListableBeanFactory.java:1062)
at org.springframework.beans.factory.support.ConstructorResolver.resolveAutowiredArgument(ConstructorResolver.java:819)
at org.springframework.beans.factory.support.ConstructorResolver.createArgumentArray(ConstructorResolver.java:725)
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:475)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1246)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1096)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:535)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:495)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:317)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:315)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:204)
at org.springframework.boot.web.servlet.ServletContextInitializerBeans.getOrderedBeansOfType(ServletContextInitializerBeans.java:226)
at org.springframework.boot.web.servlet.ServletContextInitializerBeans.getOrderedBeansOfType(ServletContextInitializerBeans.java:214)
at org.springframework.boot.web.servlet.ServletContextInitializerBeans.addServletContextInitializerBeans(ServletContextInitializerBeans.java:91)
at org.springframework.boot.web.servlet.ServletContextInitializerBeans.<init>(ServletContextInitializerBeans.java:80)
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.getServletContextInitializerBeans(ServletWebServerApplicationContext.java:250)
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.selfInitialize(ServletWebServerApplicationContext.java:237)
at org.springframework.boot.web.embedded.tomcat.TomcatStarter.onStartup(TomcatStarter.java:54)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5245)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1420)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1410)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Your namespace in hazelcast.xml must be the same as the prefix of the properties in eureka-client.properties.
In other words, you need to either change the namespace to:
<property name="namespace">hazelcast</property>
Or change your eureka-client.properties to:
hazelcast-app-name.shouldUseDns=false
hazelcast-app-name.datacenter=primary
hazelcast-app-name.name=hazelcast-app-name-sessions
hazelcast-app-name.serviceUrl.default=http://username:password@svcregistry1-dev.company.com:8580/eureka/,http://username:password@svcregistry2-dev.company.com:8590/eureka/
Please read more at:
Hazelcast Eureka Plugin GH repository
Hazelcast Eureka Plugin Code Sample
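To illustrate why the mismatch leaves the server list empty, here is a minimal sketch of a namespace-prefixed property lookup. It is not the Eureka client's or the plugin's actual code: the class NamespaceLookupSketch, its lookup method, and the example URL are hypothetical, and the only assumption is that keys are resolved as the namespace, a dot, and the property name.

```java
import java.util.Properties;

public class NamespaceLookupSketch {

    // Hypothetical helper: resolves "<namespace>.<key>" against the loaded properties.
    static String lookup(Properties props, String namespace, String key) {
        return props.getProperty(namespace + "." + key);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Same prefix as in the question's eureka-client.properties (placeholder URL).
        props.setProperty("hazelcast.serviceUrl.default",
                "http://registry.example.invalid/eureka/");

        // Namespace from the question's hazelcast.xml: no key starts with this prefix.
        System.out.println(lookup(props, "hazelcast-app-name", "serviceUrl.default"));

        // Namespace that matches the prefix used in the properties file.
        System.out.println(lookup(props, "hazelcast", "serviceUrl.default"));
    }
}
```

The first lookup prints null and the second prints the URL. In this sketch the file loads without error, but a lookup under the wrong prefix finds nothing, which is consistent with the "Config resolved to []" line in the log.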

Related

Flink Job Deployment In Kubernetes

I am trying to deploy a Flink job in a Kubernetes cluster (Azure AKS). The job cluster pod aborts just after starting, but the task manager runs fine.
The Docker image is built successfully without any exception. I am able to run the image and to SSH into the container.
I have followed the steps in the link below:
https://github.com/apache/flink/tree/release-1.9/flink-container/kubernetes
While building the image I provided the job JAR, and it was copied to "/opt/artifacts" inside the image. I still do not understand why the job cluster pod log shows the exception below.
Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.
I am new to Kubernetes. Could you please give me some clues for debugging this issue?
Please find below complete logs:
A. flink-job-cluster Pod Log
develk@ACIDLAELKV01:~/cntx_eng$ kubectl logs flink-job-cluster-kszwf
Starting the job-cluster
Starting standalonejob as a console application on host flink-job-cluster-kszwf.
2019-12-12 10:37:17,170 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2019-12-12 10:37:17,172 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting StandaloneJobClusterEntryPoint (Version: 1.8.0, Rev:4caec0d, Date:03.04.2019 @ 13:25:54 PDT)
2019-12-12 10:37:17,172 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current user: flink
2019-12-12 10:37:17,173 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current Hadoop/Kerberos user: <no hadoop dependency found>
2019-12-12 10:37:17,173 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: OpenJDK 64-Bit Server VM - IcedTea - 1.8/25.212-b04
2019-12-12 10:37:17,173 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum heap size: 989 MiBytes
2019-12-12 10:37:17,173 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk/jre
2019-12-12 10:37:17,174 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop Dependency available
2019-12-12 10:37:17,174 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options:
2019-12-12 10:37:17,174 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms1024m
2019-12-12 10:37:17,174 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx1024m
2019-12-12 10:37:17,174 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlog4j.configuration=file:/opt/flink-1.8.0/conf/log4j-console.properties
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dlogback.configurationFile=file:/opt/flink-1.8.0/conf/logback-console.xml
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program Arguments:
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --configDir
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - /opt/flink-1.8.0/conf
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Djobmanager.rpc.address=flink-job-cluster
2019-12-12 10:37:17,175 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dparallelism.default=1
2019-12-12 10:37:17,176 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dblob.server.port=6124
2019-12-12 10:37:17,176 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Dqueryable-state.server.ports=6125
2019-12-12 10:37:17,176 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath: /opt/flink-1.8.0/lib/log4j-1.2.17.jar:/opt/flink-1.8.0/lib/slf4j-log4j12-1.7.15.jar:/opt/flink-1.8.0/lib/flink-dist_2.11-1.8.0.jar:::
2019-12-12 10:37:17,176 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --------------------------------------------------------------------------------
2019-12-12 10:37:17,178 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX signal handlers for [TERM, HUP, INT]
2019-12-12 10:37:17,306 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2019-12-12 10:37:17,306 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2019-12-12 10:37:17,307 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
2019-12-12 10:37:17,307 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.size, 1024m
2019-12-12 10:37:17,307 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2019-12-12 10:37:17,307 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2019-12-12 10:37:17,336 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting StandaloneJobClusterEntryPoint.
2019-12-12 10:37:17,336 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install default filesystem.
2019-12-12 10:37:17,343 INFO org.apache.flink.core.fs.FileSystem - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
2019-12-12 10:37:17,352 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install security context.
2019-12-12 10:37:17,362 INFO org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2019-12-12 10:37:17,381 INFO org.apache.flink.runtime.security.SecurityUtils - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
2019-12-12 10:37:17,382 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing cluster services.
2019-12-12 10:37:17,638 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start actor system at flink-job-cluster:6123
2019-12-12 10:37:18,163 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-12-12 10:37:18,237 INFO akka.remote.Remoting - Starting remoting
2019-12-12 10:37:18,366 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink@flink-job-cluster:6123]
2019-12-12 10:37:18,375 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system started at akka.tcp://flink@flink-job-cluster:6123
2019-12-12 10:37:18,398 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-12 10:37:18,407 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-63338044-67c1-4872-a3d9-c94563b3a7c3
2019-12-12 10:37:18,412 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:6124 - max concurrent requests: 50 - max backlog: 1000
2019-12-12 10:37:18,428 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported.
2019-12-12 10:37:18,430 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Trying to start actor system at flink-job-cluster:0
2019-12-12 10:37:18,464 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2019-12-12 10:37:18,472 INFO akka.remote.Remoting - Starting remoting
2019-12-12 10:37:18,480 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink-metrics@flink-job-cluster:33529]
2019-12-12 10:37:18,482 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Actor system started at akka.tcp://flink-metrics@flink-job-cluster:33529
2019-12-12 10:37:18,490 INFO org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB cache storage directory /tmp/blobStore-ba64dcdb-5095-41fc-9c98-0f1528d95c40
2019-12-12 10:37:18,514 INFO org.apache.flink.configuration.Configuration - Config uses fallback configuration key 'jobmanager.rpc.address' instead of key 'rest.address'
2019-12-12 10:37:18,515 WARN org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Upload directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-upload does not exist, or has been deleted externally. Previously uploaded files are no longer available.
2019-12-12 10:37:18,516 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Created directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-upload for file uploads.
2019-12-12 10:37:18,603 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Starting rest endpoint.
2019-12-12 10:37:18,872 WARN org.apache.flink.runtime.webmonitor.WebMonitorUtils - Log file environment variable 'log.file' is not set.
2019-12-12 10:37:18,872 WARN org.apache.flink.runtime.webmonitor.WebMonitorUtils - JobManager log files are unavailable in the web dashboard. Log file location not found in environment variable 'log.file' or configuration key 'Key: 'web.log.path' , default: null (fallback keys: [{key=jobmanager.web.log.path, isDeprecated=true}])'.
2019-12-12 10:37:19,115 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Rest endpoint listening at flink-job-cluster:8081
2019-12-12 10:37:19,116 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - http://flink-job-cluster:8081 was granted leadership with leaderSessionID=00000000-0000-0000-0000-000000000000
2019-12-12 10:37:19,116 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Web frontend listening at http://flink-job-cluster:8081.
2019-12-12 10:37:19,239 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at akka://flink/user/resourcemanager .
2019-12-12 10:37:19,262 INFO org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever - Scanning class path for job JAR
2019-12-12 10:37:19,270 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Shutting down rest endpoint.
2019-12-12 10:37:19,295 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Removing cache directory /tmp/flink-web-f6be0c2d-5099-4bd6-bc72-a0ae1fc6448e/flink-web-ui
2019-12-12 10:37:19,299 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - http://flink-job-cluster:8081 lost leadership
2019-12-12 10:37:19,299 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Shut down complete.
2019-12-12 10:37:19,302 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting StandaloneJobClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:131)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:114)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: java.util.NoSuchElementException: No JAR with manifest attribute for entry class
at org.apache.flink.container.entrypoint.JarManifestParser.findOnlyEntryClass(JarManifestParser.java:80)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.scanClassPathForJobJar(ClassPathJobGraphRetriever.java:137)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:129)
... 11 more
.
2019-12-12 10:37:19,305 INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:6124
2019-12-12 10:37:19,305 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
2019-12-12 10:37:19,315 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopping Akka RPC service.
2019-12-12 10:37:19,320 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2019-12-12 10:37:19,321 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2019-12-12 10:37:19,323 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-12 10:37:19,325 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-12 10:37:19,354 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2019-12-12 10:37:19,356 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2019-12-12 10:37:19,378 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
2019-12-12 10:37:19,382 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Could not start cluster entrypoint StandaloneJobClusterEntryPoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
... 2 more
Caused by: org.apache.flink.util.FlinkException: Failed to find job JAR on class path. Please provide the job class name explicitly.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:131)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:114)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: java.util.NoSuchElementException: No JAR with manifest attribute for entry class
at org.apache.flink.container.entrypoint.JarManifestParser.findOnlyEntryClass(JarManifestParser.java:80)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.scanClassPathForJobJar(ClassPathJobGraphRetriever.java:137)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.getJobClassNameOrScanClassPath(ClassPathJobGraphRetriever.java:129)
... 11 more
develk@ACIDLAELKV01:~/cntx_eng$
I then added the job class name to the args section of the "job-cluster-job.yaml.template" file, like this:
args: ["job-cluster",
"--job-classname", "com.flink.wordCountSimple",
"-Djobmanager.rpc.address=flink-job-cluster",
After that change, I get the following exception:
Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
The detailed log follows.
2019-12-13 19:08:34,323 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Shut down complete.
2019-12-13 19:08:34,329 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting StandaloneJobClusterEntryPoint down with application status FAILED. Diagnostics org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:119)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: java.lang.ClassNotFoundException: com.flink.wordCountSimple
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:116)
... 10 more
.
2019-12-13 19:08:34,337 INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:6124
2019-12-13 19:08:34,338 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
2019-12-13 19:08:34,364 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopping Akka RPC service.
2019-12-13 19:08:34,368 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2019-12-13 19:08:34,372 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-13 19:08:34,392 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2019-12-13 19:08:34,392 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2019-12-13 19:08:34,406 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2019-12-13 19:08:34,410 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
2019-12-13 19:08:34,434 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
2019-12-13 19:08:34,443 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Could not start cluster entrypoint StandaloneJobClusterEntryPoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint StandaloneJobClusterEntryPoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:190)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:535)
at org.apache.flink.container.entrypoint.StandaloneJobClusterEntryPoint.main(StandaloneJobClusterEntryPoint.java:105)
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:172)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:171)
... 2 more
Caused by: org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:119)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.retrieveJobGraph(ClassPathJobGraphRetriever.java:96)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:62)
at org.apache.flink.runtime.dispatcher.JobDispatcherFactory.createDispatcher(JobDispatcherFactory.java:41)
at org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:184)
... 6 more
Caused by: java.lang.ClassNotFoundException: com.flink.wordCountSimple
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.flink.container.entrypoint.ClassPathJobGraphRetriever.createPackagedProgram(ClassPathJobGraphRetriever.java:116)
... 10 more
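A `ClassNotFoundException` for the entry class usually means either that the job JAR is not on the JobManager's class path, or that the name passed to `--job-classname` does not match the class inside the JAR (package and letter case included). The following is a minimal sketch for checking this from inside the container. It assumes `unzip` is installed in the image; the helper names and the JAR path in the usage comment are hypothetical.

```shell
# Convert a fully qualified class name into the entry path it has inside a JAR,
# e.g. com.flink.wordCountSimple -> com/flink/wordCountSimple.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed if the JAR (a zip archive) lists an entry for the given class.
# Assumes `unzip` is available; `jar tf` lists entries the same way on a JDK image.
jar_has_class() {
    jar_file="$1"
    entry="$(class_to_entry "$2")"
    unzip -l "$jar_file" | grep -qF "$entry"
}

# Example (hypothetical path; use the location the image copies the job JAR to):
#   jar_has_class /opt/flink/lib/job.jar com.flink.wordCountSimple && echo found
```

If the class is listed under a different package or with different capitalization, the value given to `--job-classname` needs to match that exact name.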
There's a complete, working example of creating and running a Flink job cluster on Kubernetes at https://github.com/alpinegizmo/flink-containers-example. Maybe that will help. See also https://www.youtube.com/watch?v=ceZtUDgh2TE.
docker-compose.local.yml
version: "2.1"
services:
  jobmanager:
    build:
      context: ./
      args:
        JAR_FILE: flink-event-tracker-bundled-1.6.0.jar
    image: test/flink-event-tracker
    expose:
      - "6123"
    ports:
      - "8081:8081"
      - "6123:6123"
    command: job-cluster --job-classname com.company.test.flink.pipelines.KafkaPipelineConsumer -Djobmanager.rpc.address=jobmanager --runner=FlinkRunner --streaming=true --checkpointingInterval=30000
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
      - JOB_MANAGER=jobmanager
    volumes:
      - data-volume:/docker/volumes
  taskmanager:
    image: test/flink-event-tracker
    expose:
      - "6121"
      - "6122"
    depends_on:
      - jobmanager
    command: task-manager -Djobmanager.rpc.address=jobmanager
    links:
      - "jobmanager:jobmanager"
    environment:
      - JOB_MANAGER_RPC_ADDRESS=jobmanager
      - JOB_MANAGER=jobmanager
    volumes:
      - data-volume:/docker/volumes
volumes:
  data-volume:
    driver: local
    driver_opts:
      o: bind
      type: none
      device: /Users/home/Development/docker/volumes/flink
Dockerfile
FROM flink:1.9
ARG JAR_FILE=""
ENV APP_OPTS ""
ENV JAVA_OPTS ""
ENV JOB_MANAGER=""
# Build arg allows passing the version at runtime
ARG VERSION=unset-version
COPY flink-conf.yml $FLINK_HOME/conf/flink-conf.yaml
COPY target/$JAR_FILE $FLINK_HOME/lib/event-tracker.jar
COPY docker-cluster-entrypoint.sh /docker-cluster-entrypoint.sh
RUN apt-get update && apt-get install procps -y && apt-get install curl -y
RUN echo "root:root" | chpasswd
RUN chmod 777 /docker-cluster-entrypoint.sh
RUN chmod 777 $FLINK_HOME/lib/event-tracker.jar
ENTRYPOINT [ "bash","/docker-cluster-entrypoint.sh" ]
docker-cluster-entrypoint.sh
# Fall back to the Flink install directory if the image does not set FLINK_HOME.
FLINK_HOME=${FLINK_HOME:-"/opt/flink"}
JOB_CLUSTER="job-cluster"
TASK_MANAGER="task-manager"
# The first argument selects the mode; the remaining ones are passed on to Flink.
CMD="$1"
shift;
if [ "${CMD}" = "--help" -o "${CMD}" = "-h" ]; then
    echo "Usage: $(basename $0) (${JOB_CLUSTER}|${TASK_MANAGER})"
    exit 0
elif [ "${CMD}" = "${JOB_CLUSTER}" -o "${CMD}" = "${TASK_MANAGER}" ]; then
    echo "Starting the ${CMD}"
    if [ "${CMD}" = "${TASK_MANAGER}" ]; then
        exec $FLINK_HOME/bin/taskmanager.sh start-foreground "$@"
    else
        exec $FLINK_HOME/bin/standalone-job.sh start-foreground "$@"
    fi
fi
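The dispatch logic of an entrypoint like this can be checked without starting Flink. The sketch below uses a hypothetical helper, `resolve_command`, that only prints the command line such a script would `exec`. It assumes `FLINK_HOME` points at the Flink installation directory (`/opt/flink` in the official image). After `shift`, `"$@"` expands to the arguments that follow the mode, whereas `$#` is only their count.

```shell
# Dry-run version of the entrypoint dispatch: prints the command instead of exec'ing it.
# resolve_command is a hypothetical helper, not part of Flink.
resolve_command() {
    flink_home="${FLINK_HOME:-/opt/flink}"
    mode="${1:-}"
    # Drop the mode so that "$@" holds only the arguments to forward.
    [ "$#" -gt 0 ] && shift
    case "$mode" in
        job-cluster)  script="standalone-job.sh" ;;
        task-manager) script="taskmanager.sh" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # "$*" is used here only for printing; a real exec should pass "$@"
    # so that each argument stays a separate word.
    echo "$flink_home/bin/$script start-foreground $*"
}
```

Calling `resolve_command job-cluster --job-classname com.example.Job` (the class name is a placeholder) prints the `standalone-job.sh` invocation with the two trailing arguments intact, which makes it easy to spot arguments that are dropped or mangled on the way to Flink.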
How to run:
mvn clean install
docker-compose -f docker-compose.local.yml up --scale taskmanager=2 > exceptionlog.log
docker-compose -f docker-compose.local.yml build
This is the entire configuration needed to run the job with Docker. To run it on Kubernetes, convert the docker-compose file into the corresponding Kubernetes manifests; the rest can stay the same. Packaging it as a Helm chart may make the Kubernetes setup easier to maintain.
Note: we use Apache Beam to write the job.

Kubernetes 1.9 can't initialize SparkContext

I am trying to follow the Spark 2.3 documentation on how to deploy jobs on a Kubernetes 1.9.3 cluster: http://spark.apache.org/docs/latest/running-on-kubernetes.html
The Kubernetes 1.9.3 cluster is operating properly on offline bare-metal servers and was installed with kubeadm. The following command was used to submit the job (SparkPi example job):
/opt/spark/bin/spark-submit --master k8s://https://k8s-master:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2 --conf spark.kubernetes.container.image=spark:v2.3.0 local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
Here is the stacktrace that we all love:
++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ SPARK_K8S_CMD=driver
+ '[' -z driver ']'
+ shift 1
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_JAVA_OPTS
+ '[' -n /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
+ '[' -n '' ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=(${JAVA_HOME}/bin/java "${SPARK_JAVA_OPTS[@]}" -cp "$SPARK_CLASSPATH" -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS)
+ exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java -Dspark.kubernetes.driver.pod.name=spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver -Dspark.driver.port=7078 -Dspark.submit.deployMode=cluster -Dspark.master=k8s://https://k8s-master:6443 -Dspark.kubernetes.executor.podNamePrefix=spark-pi-b6f8a60df70a3b9d869c4e305518f43a -Dspark.driver.blockManager.port=7079 -Dspark.app.id=spark-7077ad8f86114551b0ae04ae63a74d5a -Dspark.driver.host=spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver-svc.default.svc -Dspark.app.name=spark-pi -Dspark.kubernetes.container.image=spark:v2.3.0 -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar,/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar -Dspark.executor.instances=2 -cp ':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' -Xms1g -Xmx1g -Dspark.driver.bindAddress=10.244.1.17 org.apache.spark.examples.SparkPi
2018-03-07 12:39:35 INFO SparkContext:54 - Running Spark version 2.3.0
2018-03-07 12:39:36 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-03-07 12:39:36 INFO SparkContext:54 - Submitted application: Spark Pi
2018-03-07 12:39:36 INFO SecurityManager:54 - Changing view acls to: root
2018-03-07 12:39:36 INFO SecurityManager:54 - Changing modify acls to: root
2018-03-07 12:39:36 INFO SecurityManager:54 - Changing view acls groups to:
2018-03-07 12:39:36 INFO SecurityManager:54 - Changing modify acls groups to:
2018-03-07 12:39:36 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2018-03-07 12:39:36 INFO Utils:54 - Successfully started service 'sparkDriver' on port 7078.
2018-03-07 12:39:36 INFO SparkEnv:54 - Registering MapOutputTracker
2018-03-07 12:39:36 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-03-07 12:39:36 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-03-07 12:39:36 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-03-07 12:39:36 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-7f5370ad-b495-4943-ad75-285b7ead3e5b
2018-03-07 12:39:36 INFO MemoryStore:54 - MemoryStore started with capacity 408.9 MB
2018-03-07 12:39:36 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2018-03-07 12:39:36 INFO log:192 - Logging initialized @1936ms
2018-03-07 12:39:36 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2018-03-07 12:39:36 INFO Server:414 - Started @2019ms
2018-03-07 12:39:36 INFO AbstractConnector:278 - Started ServerConnector@4215838f{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-03-07 12:39:36 INFO Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5b6813df{/jobs,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@495083a0{/jobs/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5fd62371{/jobs/job,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2b62442c{/jobs/job/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@66629f63{/stages,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@841e575{/stages/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@27a5328c{/stages/stage,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6b5966e1{/stages/stage/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@65e61854{/stages/pool,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1568159{/stages/pool/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4fcee388{/storage,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6f80fafe{/storage/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3af17be2{/storage/rdd,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@f9879ac{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@37f21974{/environment,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5f4d427e{/environment/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e521c1e{/executors,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@224b4d61{/executors/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@5d5d9e5{/executors/threadDump,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@303e3593{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4ef27d66{/static,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62dae245{/,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4b6579e8{/api,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3954d008{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2f94c4db{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-03-07 12:39:36 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver-svc.default.svc:4040
2018-03-07 12:39:36 INFO SparkContext:54 - Added JAR /opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar at spark://spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver-svc.default.svc:7078/jars/spark-examples_2.11-2.3.0.jar with timestamp 1520426376949
2018-03-07 12:39:37 WARN KubernetesClusterManager:66 - The executor's init-container config map is not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2018-03-07 12:39:37 WARN KubernetesClusterManager:66 - The executor's init-container config map key is not specified. Executors will therefore not attempt to fetch remote or submitted dependencies.
2018-03-07 12:39:42 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)
... 8 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)
at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)
at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
at okhttp3.RealCall.execute(RealCall.java:69)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)
... 12 more
2018-03-07 12:39:42 INFO AbstractConnector:318 - Stopped Spark@4215838f{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-03-07 12:39:42 INFO SparkUI:54 - Stopped Spark web UI at http://spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver-svc.default.svc:4040
2018-03-07 12:39:42 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-03-07 12:39:42 INFO MemoryStore:54 - MemoryStore cleared
2018-03-07 12:39:42 INFO BlockManager:54 - BlockManager stopped
2018-03-07 12:39:42 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2018-03-07 12:39:42 WARN MetricsSystem:66 - Stopping a MetricsSystem that is not running
2018-03-07 12:39:42 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-03-07 12:39:42 INFO SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [spark-pi-b6f8a60df70a3b9d869c4e305518f43a-driver] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)
... 8 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try again
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at okhttp3.Dns$1.lookup(Dns.java:39)
at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)
at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)
at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)
at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)
at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)
at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)
at okhttp3.RealCall.execute(RealCall.java:69)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)
... 12 more
2018-03-07 12:39:42 INFO ShutdownHookManager:54 - Shutdown hook called
2018-03-07 12:39:42 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-64fe7ad8-669f-4591-a3f6-67440d450a44
So apparently the Kubernetes scheduler backend cannot contact the pod because it is unable to resolve kubernetes.default.svc. Hmm... why?
I also configured RBAC with a spark service account as mentioned in the documentation, but the same problem occurs. (I also tried a different namespace, with the same result.)
Here are the logs from kube-dns:
I0306 16:04:04.170889 1 dns.go:555] Could not find endpoints for service "spark-pi-b9e8b4c66fe83c4d94a8d46abc2ee8f5-driver-svc" in namespace "default". DNS records will be created once endpoints show up.
I0306 16:04:29.751201 1 dns.go:555] Could not find endpoints for service "spark-pi-0665ad323820371cb215063987a31e05-driver-svc" in namespace "default". DNS records will be created once endpoints show up.
I0306 16:06:26.414146 1 dns.go:555] Could not find endpoints for service "spark-pi-2bf24282e8033fa9a59098616323e267-driver-svc" in namespace "default". DNS records will be created once endpoints show up.
I0307 08:16:17.404971 1 dns.go:555] Could not find endpoints for service "spark-pi-3887031e031732108711154b2ec57d28-driver-svc" in namespace "default". DNS records will be created once endpoints show up.
I0307 08:17:11.682218 1 dns.go:555] Could not find endpoints for service "spark-pi-3d84127226393fc99e2fe035db56bfb5-driver-svc" in namespace "default". DNS records will be created once endpoints show up.
I really can't figure out why those errors come up.
Try switching the pod network to a plugin other than Calico, then check whether kube-dns works properly.
To create a custom service account, a user can use the kubectl create serviceaccount command. For example, the following command creates a service account named spark:
$ kubectl create serviceaccount spark
To grant a service account a Role or ClusterRole, a RoleBinding or ClusterRoleBinding is needed. To create a RoleBinding or ClusterRoleBinding, a user can use the kubectl create rolebinding (or clusterrolebinding for ClusterRoleBinding) command. For example, the following command creates an edit ClusterRole in the default namespace and grants it to the spark service account created above:
$ kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
Depending on the version and setup of Kubernetes deployed, this default service account may or may not have the role that allows driver pods to create pods and services under the default Kubernetes RBAC policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>. For example, to make the driver pod use the spark service account, a user simply adds the following option to the spark-submit command:
spark-submit --master k8s://https://192.168.1.5:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=5 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark --conf spark.kubernetes.container.image=leeivan/spark:latest local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
I faced the same issue. If you're using minikube, try recreating the cluster with minikube delete followed by minikube start.
Then create the serviceaccount and clusterrolebinding.
To add to openbrace's answer, and based on Ivan Lee's answer too: if you are using minikube, running the following command was enough for me:
kubectl create clusterrolebinding default --clusterrole=edit --serviceaccount=default:default --namespace=default
That way, I didn't have to change spark.kubernetes.authenticate.driver.serviceAccountName when using spark-submit.
Test whether a pod running your Spark image can resolve DNS.
Create a test-dns.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: testdns
  namespace: default
spec:
  containers:
  - name: testdns
    image: <your-spark-image>
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
Apply it:
kubectl apply -f test-dns.yaml
Run nslookup for kubernetes.default.svc:
kubectl exec -ti testdns -- nslookup kubernetes.default.svc
Run it multiple times to check whether your DNS resolves consistently.
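If the image happens to ship Python 3 (an assumption; many Spark images do not), the repeated lookup can also be scripted from inside the pod. This is only a rough sketch: the function name count_dns_failures and the default of 10 attempts are made up for illustration and are not part of Spark or Kubernetes.

```python
import socket
from typing import Callable


def count_dns_failures(
    host: str,
    attempts: int = 10,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> int:
    """Resolve `host` `attempts` times and return how many lookups failed."""
    failures = 0
    for _ in range(attempts):
        try:
            resolver(host)
        except OSError:
            # socket.gaierror (e.g. "Try again") is a subclass of OSError.
            failures += 1
    return failures
```

Calling count_dns_failures("kubernetes.default.svc") inside the test pod and getting anything other than 0 would suggest that resolution is intermittent rather than permanently broken.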

datastax LDAP config error "Unable to find property"

I have installed DSE 5.1 on a two-node Cassandra test cluster, which is working fine.
I need to configure LDAP authentication.
Below are my dse.yaml and cassandra.yaml files.
=========================
server_host: hostname
server_port: 389
search_dn: cn=username
search_password: ldappassword
user_search_base: dc=test,dc=testdomain,dc=com
user_memberof_attribute: member
group_search_type: directory_search#
group_search_filter: (&(cn=*)(objectclass=group))
group_name_attribute: cn
credentials_validity_in_ms: 0
connection_pool:
max_active: 8
max_idle: 8
========================================
cassandra.yaml
authenticator: com.datastax.bdp.datastax.bdp.cassandra.auth.LdapAuthenticator
authorizer: com.datastax.bdp.cassandra.auth.DseAuthorizer
role_manager: com.datastax.bdp.cassandra.auth.DseRoleManager
roles_validity_in_ms: 2000
dse version
[root@hostname dse]# dse -v
5.1.3
[root@hostname dse]#
=========================================
The error I am getting:
====================================================
ned_function_warn_timeout=500; user_function_timeout_policy=die; windows_timer_interval=1; write_request_timeout_in_ms=2000]
INFO [main] 2017-10-18 09:45:54,428 DatabaseDescriptor.java:368 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO [main] 2017-10-18 09:45:54,428 DatabaseDescriptor.java:422 - Global memtable on-heap threshold is enabled at 8192MB
INFO [main] 2017-10-18 09:45:54,428 DatabaseDescriptor.java:426 - Global memtable off-heap threshold is enabled at 8192MB
INFO [main] 2017-10-18 09:45:54,447 RateBasedBackPressure.java:123 - Initialized back-pressure with high ratio: 0.9, factor: 5, flow: FAST, window size: 2000.
INFO [main] 2017-10-18 09:45:54,447 DatabaseDescriptor.java:718 - Back-pressure is disabled with strategy org.apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5, flow=FAST}.
INFO [main] 2017-10-18 09:45:54,468 DseDelegateSnitch.java:40 - Setting my workloads to [Analytics, Cassandra]
INFO [main] 2017-10-18 09:45:54,473 DseConfigYamlLoader.java:38 - Loading settings from file:/etc/dse/dse.yaml
ERROR [main] 2017-10-18 09:45:54,516 DseModule.java:109 - Unable to start server. Exiting..
org.yaml.snakeyaml.error.YAMLException: Unable to find property 'server_host' on class: com.datastax.bdp.config.Config
at com.datastax.bdp.config.DseYamlPropertyUtils.getProperty(DseYamlPropertyUtils.java:70) ~[dse-core-5.1.3.jar:5.1.3]
at org.yaml.snakeyaml.introspector.PropertyUtils.getProperty(PropertyUtils.java:121) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.getProperty(Constructor.java:308) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:240) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:189) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:331) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:141) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:127) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481) ~[snakeyaml-1.12.jar:na]
at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:475) ~[snakeyaml-1.12.jar:na]
at com.datastax.bdp.config.DseConfigYamlLoader.(DseConfigYamlLoader.java:57) ~[dse-core-5.1.3.jar:5.1.3]
at com.datastax.bdp.snitch.DseDelegateSnitch.(DseDelegateSnitch.java:41) ~[dse-core-5.1.3.jar:5.1.3]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_144]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_144]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_144]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_144]
at java.lang.Class.newInstance(Class.java:442) ~[na:1.8.0_144]
at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:525) ~[cassandra-all-3.11.0.1855.jar:3.11.0.1855]
at org.apache.cassandra.utils.FBUtilities.construct(FBUtilities.java:518) ~[cassandra-all-3.11.0.1855.jar:3.11.0.1855]
at org.apache.cassandra.config.DatabaseDescriptor.createEndpointSnitch(DatabaseDescriptor.java:1028) ~[cassandra-all-3.11.0.1855.jar::
==============================================================
It looks like you removed the parent key ldap_options for all the LDAP options in dse.yaml. All of those sub-keys you specified need to be nested under the ldap_options root key.
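To make the nesting concrete, here is a small sketch that works on an already parsed config represented as a plain Python dict. The helper nest_under is hypothetical and is not part of DSE; the key names are taken from the dse.yaml shown in the question.

```python
from typing import Any, Dict, Iterable


def nest_under(config: Dict[str, Any], parent: str, keys: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of `config` with `keys` moved from the top level under `parent`."""
    wanted = set(keys)
    # Start from whatever is already nested under the parent key, if anything.
    nested = dict(config.get(parent) or {})
    result: Dict[str, Any] = {}
    for key, value in config.items():
        if key == parent:
            continue
        if key in wanted:
            nested[key] = value  # misplaced top-level option: move it down
        else:
            result[key] = value  # unrelated setting: leave it at the top level
    result[parent] = nested
    return result


# A subset of the flat keys from the question, as they would look after parsing.
flat = {"server_host": "hostname", "server_port": 389, "search_dn": "cn=username"}
fixed = nest_under(flat, "ldap_options", flat.keys())
```

After this, fixed contains a single top-level ldap_options key holding server_host, server_port and search_dn, which is the shape the answer describes.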

how can i run jhipster-registry in my external server with war file

I have an application with one microservice and one gateway. I deploy my application's WAR files on my external server.
So how can I deploy jhipster-registry on my external server as a WAR file?
Thanks.
@Gaël Marziou
The log file of my Registry:
:: Running Spring Boot 1.3.6.
:: http://jhipster.github.io
::-10-09 03:40:55.863 INFO 5278 --- [ost-startStop-1] i.g.jhipster.registry.ApplicationWebXml : The following profiles are active: dev
2016-10-09 03:40:57.076 WARN 5278 --- [ost-startStop-1] o.s.c.a.ConfigurationClassPostProcessor : Cannot enhance @Configuration bean definitio$
2016-10-09 03:40:57.646 DEBUG 5278 --- [ost-startStop-1] i.g.j.r.config.MetricsConfiguration : Registering JVM gauges
2016-10-09 03:40:57.672 DEBUG 5278 --- [ost-startStop-1] i.g.j.r.config.MetricsConfiguration : Initializing Metrics JMX reporting
2016-10-09 03:40:59.427 INFO 5278 --- [ost-startStop-1] i.g.j.registry.config.WebConfigurer : Web application configuration, using profile$
2016-10-09 03:40:59.427 DEBUG 5278 --- [ost-startStop-1] i.g.j.registry.config.WebConfigurer : Initializing Metrics registries
2016-10-09 03:40:59.430 DEBUG 5278 --- [ost-startStop-1] i.g.j.registry.config.WebConfigurer : Registering Metrics Filter
2016-10-09 03:40:59.431 DEBUG 5278 --- [ost-startStop-1] i.g.j.registry.config.WebConfigurer : Registering Metrics Servlet
2016-10-09 03:40:59.431 INFO 5278 --- [ost-startStop-1] i.g.j.registry.config.WebConfigurer : Web application fully configured
2016-10-09 03:40:59.442 INFO 5278 --- [ost-startStop-1] i.g.jhipster.registry.JHipsterRegistry : Running with Spring profile(s) : [dev]
2016-10-09 03:41:00.536 INFO 5278 --- [ost-startStop-1] com.netflix.discovery.DiscoveryClient : Client configured to neither register nor qu$
2016-10-09 03:41:00.545 INFO 5278 --- [ost-startStop-1] com.netflix.discovery.DiscoveryClient : Discovery Client initialized at timestamp 14$
2016-10-09 03:41:00.623 INFO 5278 --- [ost-startStop-1] c.n.eureka.DefaultEurekaServerContext : Initializing ...
2016-10-09 03:41:00.625 INFO 5278 --- [ost-startStop-1] c.n.eureka.cluster.PeerEurekaNodes : Adding new peer nodes [http ://admin:admin#lo$
2016-10-09 03:41:00.747 INFO 5278 --- [ost-startStop-1] c.n.d.provider.DiscoveryJerseyProvider : Using JSON encoding codec LegacyJacksonJson
2016-10-09 03:41:00.748 INFO 5278 --- [ost-startStop-1] c.n.d.provider.DiscoveryJerseyProvider : Using JSON decoding codec LegacyJacksonJson
2016-10-09 03:41:00.748 INFO 5278 --- [ost-startStop-1] c.n.d.provider.DiscoveryJerseyProvider : Using XML encoding codec XStreamXml
2016-10-09 03:41:00.748 INFO 5278 --- [ost-startStop-1] c.n.d.provider.DiscoveryJerseyProvider : Using XML decoding codec XStreamXml
2016-10-09 03:41:00.989 INFO 5278 --- [ost-startStop-1] c.n.eureka.cluster.PeerEurekaNodes : Replica node URL: http ://admin:admin#localh$
2016-10-09 03:41:00.997 INFO 5278 --- [ost-startStop-1] c.n.e.registry.AbstractInstanceRegistry : Finished initializing remote region registri$
2016-10-09 03:41:00.998 INFO 5278 --- [ost-startStop-1] c.n.eureka.DefaultEurekaServerContext : Initialized
2016-10-09 03:41:02.969 WARN 5278 --- [ost-startStop-1] c.n.c.sources.URLConfigurationSource : No URLs will be polled as dynamic configurat$
2016-10-09 03:41:02.969 INFO 5278 --- [ost-startStop-1] c.n.c.sources.URLConfigurationSource : To enable URLs as dynamic configuration sour$
2016-10-09 03:41:02.977 WARN 5278 --- [ost-startStop-1] c.n.c.sources.URLConfigurationSource : No URLs will be polled as dynamic configurat$
2016-10-09 03:41:02.977 INFO 5278 --- [ost-startStop-1] c.n.c.sources.URLConfigurationSource : To enable URLs as dynamic configuration sour$
2016-10-09 03:41:05.582 INFO 5278 --- [ Thread-13] c.n.e.r.PeerAwareInstanceRegistryImpl : Got 1 instances from neighboring DS node
2016-10-09 03:41:05.583 INFO 5278 --- [ Thread-13] c.n.e.r.PeerAwareInstanceRegistryImpl : Renew threshold is: 1
2016-10-09 03:41:05.583 INFO 5278 --- [ Thread-13] c.n.e.r.PeerAwareInstanceRegistryImpl : Changing status to UP
2016-10-09 03:41:05.660 INFO 5278 --- [ost-startStop-1] i.g.jhipster.registry.ApplicationWebXml : Started ApplicationWebXml in 11.934 seconds $
Oct 09, 2016 3:41:05 AM org.apache.catalina.core.ContainerBase startInternal
SEVERE: A child container failed during start
java.util.concurrent.ExecutionException: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHos$
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:1123)
at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:799)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHost[localhost].StandardContext[]]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:154)
... 6 more
Caused by: java.lang.NoSuchMethodError: javax.servlet.ServletContext.getVirtualServerName()Ljava/lang/String;
at org.apache.tomcat.websocket.server.WsServerContainer.(WsServerContainer.java:150)
at org.apache.tomcat.websocket.server.WsSci.init(WsSci.java:131)
at org.apache.tomcat.websocket.server.WsSci.onStartup(WsSci.java:47)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5493)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
... 6 more
Oct 09, 2016 3:41:05 AM org.apache.catalina.core.ContainerBase startInternal
SEVERE: A child container failed during start
java.util.concurrent.ExecutionException: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHos$
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:1123)
at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:300)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.StandardService.startInternal(StandardService.java:443)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:731)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.startup.Catalina.start(Catalina.java:689)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:321)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:455)
Caused by: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina].StandardHost[localhost]]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:154)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.catalina.LifecycleException: A child container failed during start
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:1131)
at org.apache.catalina.core.StandardHost.startInternal(StandardHost.java:799)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
... 6 more
Oct 09, 2016 3:41:05 AM org.apache.catalina.startup.Catalina start
SEVERE: The required Server component failed to start so Tomcat is unable to start.
org.apache.catalina.LifecycleException: Failed to start component [StandardServer[8004]]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:154)
at org.apache.catalina.startup.Catalina.start(Catalina.java:689)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:321)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:455)
Caused by: org.apache.catalina.LifecycleException: Failed to start component [StandardService[Catalina]]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:154)
at org.apache.catalina.core.StandardServer.startInternal(StandardServer.java:731)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
... 7 more
Caused by: org.apache.catalina.LifecycleException: Failed to start component [StandardEngine[Catalina]]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:154)
at org.apache.catalina.core.StandardService.startInternal(StandardService.java:443)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
... 9 more
Caused by: org.apache.catalina.LifecycleException: A child container failed during start
at org.apache.catalina.core.ContainerBase.startInternal(ContainerBase.java:1131)
at org.apache.catalina.core.StandardEngine.startInternal(StandardEngine.java:300)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
... 11 more
Oct 09, 2016 3:41:05 AM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-bio-8762"]
Oct 09, 2016 3:41:05 AM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service Catalina
Oct 09, 2016 3:41:05 AM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-bio-8762"]
2016-10-09 03:41:05.706 INFO 5278 --- [ main] com.netflix.discovery.DiscoveryClient : Completed shut down of DiscoveryClient
2016-10-09 03:41:05.725 INFO 5278 --- [ main] c.n.eureka.DefaultEurekaServerContext : Shutting down ...
2016-10-09 03:41:05.735 INFO 5278 --- [ main] c.n.eureka.DefaultEurekaServerContext : Shut down
Oct 09, 2016 3:41:05 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [ReplicaAwareInstanceRegistry - RenewalThresholdUpdater] but has failed to$
Oct 09, 2016 3:41:05 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [Eureka-JerseyClient-Conn-Cleaner2] but has failed to stop it. This is ver$
Oct 09, 2016 3:41:05 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [StatsMonitor-0] but has failed to stop it. This is very likely to create $
Oct 09, 2016 3:41:05 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [] appears to have started a thread named [Eureka-CacheFillTimer] but has failed to stop it. This is very likely to $
Oct 09, 2016 3:41:05 AM org.apache.catalina.loader.WebappClassLoader loadClass
INFO: Illegal access: this web application instance has been stopped already. Could not load org.eclipse.jgit.util.FileUtils. The eventual fol$
java.lang.IllegalStateException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1610)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1569)
at org.springframework.cloud.config.server.support.AbstractScmAccessor$1.run(AbstractScmAccessor.java:91)
Exception in thread "Thread-5" java.lang.NoClassDefFoundError: org/eclipse/jgit/util/FileUtils
at org.springframework.cloud.config.server.support.AbstractScmAccessor$1.run(AbstractScmAccessor.java:91)
Caused by: java.lang.ClassNotFoundException: org.eclipse.jgit.util.FileUtils
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1718)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1569)
... 1 more

hiveserver2 can not run sql on Spark on Yarn

Here are my versions:
Hive: 1.2
Hadoop: CDH5.3
Spark: 1.4.1
Hive on Spark works with the Hive client, but after I started HiveServer2 and ran a SQL query through Beeline, it failed.
The error is:
2015-11-29 21:49:42,786 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:42 INFO spark.SparkContext: Added JAR file:/root/cdh/apache-hive-1.2.1-bin/lib/hive-exec-1.2.1.jar at http://10.96.30.51:10318/jars/hive-exec-1.2.1.jar with timestamp 1448804982784
2015-11-29 21:49:43,336 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm297
2015-11-29 21:49:43,356 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm297 after 1 fail over attempts. Trying to fail over immediately.
2015-11-29 21:49:43,357 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm280
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - 15/11/29 21:49:43 INFO retry.RetryInvocationHandler: Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm280 after 2 fail over attempts. Trying to fail over after sleeping for 477ms.
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - java.net.ConnectException: Call From hd-master-001/10.96.30.51 to hd-master-001:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
2015-11-29 21:49:43,359 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(569)) - at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
My YARN status is that hd-master-002 is the active ResourceManager and hd-master-001 is the standby. Port 8032 on hd-master-001 is not open, so of course a connection error occurs when trying to connect to it.
But why does it try to connect to the standby ResourceManager?
If I use the Hive client command shell with Spark on YARN, everything is OK.
PS: I didn't rebuild the Spark assembly jar without Hive; I only removed 'org.apache.hive' and 'org.apache.hadoop.hive' from the built assembly jar. But I do not think that is the problem, because the Hive client works with Spark on YARN.

Resources