I am currently using Spring Boot 2.3.0 to build an Apache Spark job in Java. The job works fine locally. I want to deploy this Spring Boot Spark job on Azure Databricks (7.2.0), but while deploying the Spring Boot jar on Azure Databricks I get the following error:
java.lang.NoSuchMethodError: org.springframework.core.ResolvableType.forInstance(Ljava/lang/Object;)Lorg/springframework/core/ResolvableType;
at org.springframework.context.event.SimpleApplicationEventMulticaster.resolveDefaultEventType(SimpleApplicationEventMulticaster.java:145)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.starting(EventPublishingRunListener.java:74)
at org.springframework.boot.SpringApplicationRunListeners.starting(SpringApplicationRunListeners.java:47)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:305)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1237)
I have checked the Azure Databricks documentation: it has spring-core 4.1.3 installed by default, while my code uses spring-core 5.2.8. So I want to ask if there is any way I can upgrade the spring-core version on Azure Databricks.
To make third-party or locally-built code available to notebooks and jobs running on your clusters, you can install a library. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.
Steps to install the Spring library in Azure Databricks:
Step 1: Download the spring-core library from the Maven repository (click the jar file to download it).
Step 2: Choose the cluster on which you want to install the library.
Libraries => Install New => Library Source: "Upload", Library Type: "Jar" => drop the previously downloaded jar file onto the drop area (or browse to it) => click Install.
The spring-core 5.2.8 library is now successfully installed on the cluster; you can verify which version the job actually picks up as shown below.
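To double-check which spring-core the job loads at runtime, you can log the version from the application itself. A minimal sketch (SparkJobApplication is just a placeholder for your own main class):
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.core.SpringVersion;

@SpringBootApplication
public class SparkJobApplication {
    public static void main(String[] args) {
        // Should report 5.2.8.RELEASE once the uploaded jar takes precedence over the cluster's default 4.1.3
        System.out.println("spring-core on classpath: " + SpringVersion.getVersion());
        SpringApplication.run(SparkJobApplication.class, args);
    }
}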
For other methods of installing packages in Azure Databricks, refer to: How to install a library on a databricks cluster using some command in the notebook?
Related
We build our web app with Azure DevOps pipelines and deploy into Azure with an Azure DevOps release. I think .NET Core was updated to 3.1.4 on our build agent today, but now our Azure DevOps deployment fails because the .NET Core 3.1.4 runtime is not yet installed on our App Service in Azure.
The error message we are getting:
Could not find 'aspnetcorev2_inprocess.dll'. Exception message:
It was not possible to find any compatible framework version
The framework 'Microsoft.AspNetCore.App', version '3.1.4' was not found.
- The following frameworks were found:
2.2.8 at [D:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
3.0.3 at [D:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
3.1.1 at [D:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
3.1.3 at [D:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
You can resolve the problem by installing the specified framework and/or SDK.
This makes sense and can happen, but what is the best way to go about fixing this?
I could pin my build to a specific .NET Core version, but I don't like this: we do want to keep updating to newer versions, we just don't want a version that is not yet available in Azure App Service.
Am I correct in thinking that we would have to publish our services as self-contained, because otherwise we could run into this issue again whenever Azure DevOps installs patches faster than Azure does?
Or is there a way to force-update Azure App Service to the new .NET Core 3.1.4 security patch, which I think would be ideal?
I just need some guidance on the best approach to fix this issue.
Or is there a way to force-update Azure App Service to the new .NET Core 3.1.4 security patch, which I think would be ideal?
AFAIK, there is no way to force-update Azure App Service to the new .NET Core 3.1.4.
We can keep track of the latest releases at https://aspnetcoreon.azurewebsites.net/, but we cannot update the App Service runtime ourselves at this moment.
To resolve this issue, we recommend that you publish your app as self-contained. This produces an application that includes the .NET Core runtime and libraries as well as your application and its dependencies, so users can run it on a machine that doesn't have the .NET Core runtime installed.
Publishing your app as self-contained produces a platform-specific executable. The output publishing folder contains all components of the app, including the .NET Core libraries and target runtime. The app is isolated from other .NET Core apps and doesn't use a locally installed shared runtime. The user of your app isn't required to download and install .NET Core.
You can check the document .NET Core application publishing overview for more details.
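For reference, a self-contained publish in an Azure DevOps pipeline could look roughly like this (a sketch; the DotNetCoreCLI task is standard, but the win-x64 runtime identifier is an assumption and should match your App Service plan):
- task: DotNetCoreCLI@2
  displayName: 'Publish self-contained'
  inputs:
    command: 'publish'
    publishWebProjects: true
    # --self-contained bundles the runtime with the app so App Service does not need 3.1.4 installed
    arguments: '--configuration Release --runtime win-x64 --self-contained true --output $(Build.ArtifactStagingDirectory)'
    zipAfterPublish: true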
Hope this helps.
If you want the .NET Core version to be picked up automatically as new versions become available, building your service as self-contained seems like a good option: nothing needs to be installed on the machine running it (i.e. the versions on Azure DevOps and the Azure Web App don't have to match).
The main downside of this approach is that the build becomes less deterministic: running your build twice on the same commit might produce different binaries depending on what is currently installed on the build agent. If you want to know more, here is an interesting post arguing why deterministic builds are important.
To keep the build deterministic, you can use the Use .NET Core task at the beginning of the build (it makes sure the desired version of the .NET SDK is on the agent), as sketched below. You can also add a global.json to your repository to lock the SDK version both for builds on your dev box and in Azure DevOps.
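For illustration, the Use .NET Core task could be pinned like this (a sketch; 3.1.201 is just an example version, pick one that matches what is available on App Service and in your global.json):
- task: UseDotNet@2
  displayName: 'Install pinned .NET Core SDK'
  inputs:
    packageType: 'sdk'
    version: '3.1.201'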
This is a common topic of discussion, and you can find a lot of blogs advocating one side or the other.
Big discussions started when Microsoft released the LTS .NET Core 3.1, and it took some time before Azure started supporting the 3.1 runtime as well.
You can find many blogs strongly suggesting that you deploy your web apps as self-contained (the runtime is ~100 MB in size) and cut loose the dependency on Microsoft supporting the latest runtime, while others advocate that applications should remain as lightweight as possible and that the runtime should be set in the pipeline. But that is still up to you. I myself prefer to deploy self-contained apps after my bad experience with .NET Core 3.1.
There is no established best practice.
In the past I've run into the same situation; you can fix it by manually setting the value in the Runtime Stack drop-down, or by manually updating the build process's .yml file:
RuntimeStack: 'DOTNETCORE|3.1'
I was creating a recommendation engine in IBM Watson Studio, for which I needed to add a Spark service, but that service is now deprecated. What should I use now?
You should use Spark environments for your Watson Studio project.
You can define the Spark environment using the Environments tab in your project and then use that runtime when you create a notebook, or change the service for an existing notebook.
@Veer, as @charles said, Spark services are now accessible via Environments. When creating a notebook, select a Spark-compatible environment and import pyspark in your notebook, as in the sketch below.
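For example, a first cell in a notebook backed by a Spark environment could look like this (a minimal sketch; in a Spark environment a session is typically pre-provisioned, so getOrCreate() just returns it):
from pyspark.sql import SparkSession

# Reuse the session provided by the Spark environment (or create a local one if none exists)
spark = SparkSession.builder.getOrCreate()
print(spark.version)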
I am planning to execute Spark from the KNIME Analytics Platform. For this I need to install the KNIME Spark Executor in the KNIME Analytics Platform.
Can anyone please let me know how to install the KNIME Spark Executor in the KNIME Analytics Platform for the Hadoop distribution CDH 5.10.x?
I am referring to the installation guide at the link below:
https://www.knime.org/knime-spark-executor
I was able to successfully configure/integrate Spark in KNIME.
I did it on CDH 5.7.
I followed these steps:
1. Download knime-full_3.3.2.linux.gtk.x86_64.tar.gz.
2. Extract the above-mentioned package and run the KNIME installation.
3. After KNIME is installed, go to File -> Install KNIME Extensions -> install the Big Data extensions (check all the Spark-related extensions and proceed).
Follow this link:
https://tech.knime.org/installation-instructions#download
4. At this point only the Big Data extensions have been installed, but they need a license to be functional.
5. A license needs to be purchased. However, a free 30-day trial is available, after which the license must be purchased.
Follow this link:
https://www.knime.org/knime-spark-executor
6. After the plugins are installed, we need to configure the Spark Job Server.
For that, we need to download the version of spark-job-server compatible with the Hadoop version we have.
Follow this link for spark-job-server versions and their compatibility:
https://www.knime.org/knime-spark-executor
I'm pretty sure it's as easy as registering for the free trial (and buying the license if you need it for longer than 30 days) and then installing the software from the Help -> Install New Software menu.
As of KNIME 3.6 (the latest version), it should be possible to connect to Spark via Livy, with no specific executor deployment on a KNIME Server needed. It is still in preview, but it should do the job.
https://www.knime.com/whats-new-in-knime-36
I want to upgrade my Spark component in Ambari from its default 2.0.x.2.5 to 2.1.0.
I am using HDP 2.5.0 with Ambari 2.4.2.
I would appreciate any ideas on how to achieve this.
HDP 2.5 shipped with a technical preview of Spark 2.0 as well as Spark 1.6.x. If you do not want to use either of those versions and you want Ambari to manage the service for you, then you will need to write a custom service for the Spark version that you want. If you don't want Ambari to manage the Spark instance, you can follow instructions similar to those provided on the Hortonworks Community Forum to manually install Spark 2.x without management.
Newer versions of Ambari (maybe 3.0) will probably support per-component upgrades/multiple component versions.
See https://issues.apache.org/jira/browse/AMBARI-12556 for details.
I want to contribute to Spark.
I have cloned the git repository locally. Please suggest how to set up Spark first and then run a hello world on it from the IDE itself.
For importing/building Spark in IntelliJ or Eclipse, follow this guide.
If you are interested in contributing to Spark visit this wiki page for more information:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
I assume you already have the latest release of Scala IDE (4.0 at this point) from scala-ide.org.
Export the projects using sbt eclipse (I guess you figured that out already).
Import all projects into your workspace (Import Existing Projects).
You will probably see a number of errors related to "cross-compiled libraries".
If you want to develop on Scala 2.10, you need to configure a Scala installation for the exact Scala version that’s used to compile Spark. At the time of this writing that is Scala 2.10.4.
You can do that in Eclipse Preferences -> Scala -> Installations by pointing to the lib/ directory of your Scala 2.10.4 distribution.
Select all Spark projects, right-click, choose Scala -> Set Scala Installation, and point to the 2.10.4 installation. This should clear all errors about invalid cross-compiled libraries.
A clean build should then succeed.
You can easily find examples of getting started with Spark, for example here. You can run a Spark app using right-click -> Run As Scala Application; a minimal example is sketched below.
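For a first run, a minimal word-count application along these lines should work from the IDE (a sketch; it uses local[*] so no cluster is needed, and the RDD API matching the Spark/Scala 2.10 setup discussed above):
import org.apache.spark.{SparkConf, SparkContext}

object HelloSpark {
  def main(args: Array[String]): Unit = {
    // Run Spark inside the IDE process; no cluster required
    val conf = new SparkConf().setAppName("HelloSpark").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val counts = sc.parallelize(Seq("hello spark", "hello world"))
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.collect().foreach(println)
    sc.stop()
  }
}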