While installing Spark in standalone mode on Ubuntu, I am facing an issue when running the sbt/sbt assembly command: it says "No such file or directory". I did the installation from scratch, which covers installing Java, Scala, and Git, and finally building Spark with the sbt tool. I followed the tutorial below for the installation.
https://www.youtube.com/watch?v=eQ0nPdfVfc0
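For reference, on Ubuntu the steps such a tutorial walks through usually boil down to something like the following sketch (package names, versions, and the checkout path here are illustrative, not taken from the video). Note that sbt/sbt assembly has to be run from the top-level Spark source directory, otherwise the shell reports "No such file or directory":

# Prerequisites (versions are illustrative)
sudo apt-get update
sudo apt-get install -y openjdk-7-jdk scala git

# Get the Spark sources and build with the bundled sbt launcher
git clone https://github.com/apache/spark.git
cd spark            # sbt/sbt only resolves relative to this directory
sbt/sbt assembly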
Related
I'm on Windows 7 64-bit and am following this blog to install Spark 2.1.0.
So I tried to build Spark from the sources that I'd cloned from https://github.com/apache/spark to C:\spark-2.1.0.
When I run sbt assembly or sbt -J-Xms2048m -J-Xmx2048m assembly, I get:
[info] Loading project definition from C:\spark-2.1.0\project
[info] Compiling 3 Scala sources to C:\spark-2.1.0\project\target\scala-2.10\sbt-0.13\classes...
java.lang.StackOverflowError
at java.security.AccessController.doPrivileged(Native Method)
at java.io.PrintWriter.<init>(Unknown Source)
at java.io.PrintWriter.<init>(Unknown Source)
at scala.reflect.api.Printers$class.render(Printers.scala:168)
at scala.reflect.api.Universe.render(Universe.scala:59)
at scala.reflect.api.Printers$class.show(Printers.scala:190)
at scala.reflect.api.Universe.show(Universe.scala:59)
at scala.reflect.api.Printers$class.treeToString(Printers.scala:182)
...
I adjusted sbt's memory settings as suggested, but they seem to be ignored anyway. Any ideas?
The linked blog post was "Posted on April 29, 2015", which is two years old now, and should only be read to learn how things have changed since then (I'm deliberately not linking the blog post, to stop directing people to the site).
The 2017 way of installing Spark on Windows is as follows:
Download Spark from http://spark.apache.org/downloads.html.
Read the official documentation starting from Downloading.
That's it.
Installing Spark on Windows
Windows is known to give you problems because of Hadoop's requirements (and Spark does use the Hadoop API under the covers).
You'll have to install the winutils binary, which you can find in the https://github.com/steveloughran/winutils repository.
TIP: Select the version of Hadoop that the Spark distribution was compiled with, e.g. hadoop-2.7.1 for Spark 2.1.0.
Save the winutils.exe binary to a directory of your choice, e.g. c:\hadoop\bin, and set HADOOP_HOME to c:\hadoop.
See Running Spark Applications on Windows for further steps.
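For that HADOOP_HOME step, a minimal sketch in a Windows command prompt (c:\hadoop is just the example directory from above):

rem Put winutils.exe under c:\hadoop\bin first, then:
setx HADOOP_HOME c:\hadoop
rem Also add %HADOOP_HOME%\bin to PATH, then open a new command prompt
rem so that the new environment variables take effect.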
The following settings worked for me (sbtconfig.txt):
# Set the java args to high
-Xmx1024M
-XX:MaxPermSize=2048m
-Xss2M
-XX:ReservedCodeCacheSize=128m
# Set the extra SBT options
-Dsbt.log.format=true
I am new to Maven and am trying to use it to build Apache Spark on Amazon EC2 VMs. I have manually installed Java 1.7.0 on the VMs. However, when I run Maven, the following error occurs:
Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.0:testCompile (scala-test-compile-first) on project spark-core_2.10: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.0:testCompile failed. CompileFailed
I think a Java version mismatch is the likely cause of the compile problem. I opened Spark's pom file for Maven, and it declares Java-related versions in two separate places:
<java.version>1.6</java.version>
and
<aws.java.sdk.version>1.8.3</aws.java.sdk.version>
What is the difference between these two versions?
Which one should I edit to resolve the Java version mismatch?
They are two different things:
<java.version>1.6</java.version>
is the Java version used, and
<aws.java.sdk.version>1.8.3</aws.java.sdk.version>
is the AWS SDK for Java version used.
The minimum requirement for AWS SDK 1.9 is Java 1.6+, so there are no compatibility issues.
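If you really need the build to target your locally installed JDK 1.7 rather than the pom's default, a Maven property such as java.version can usually be overridden on the command line instead of editing the pom. This is a general Maven sketch and assumes Spark's build does not pin the value elsewhere:

# Override the <java.version> property for this build only (illustrative)
mvn -Djava.version=1.7 -DskipTests clean package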
I'm attempting to build Apache Spark 1.1.0 on Windows 8.
I've installed all prerequisites (except Hadoop) and ran sbt/sbt assembly while in the root directory. After downloading many files, I'm getting an error after the line:
Set current project to root <in build file:C:/.../spark-0.9.0-incubating/>. The error is:
[error] Not a valid command: /
[error] /sbt
[error] ^
How to build Spark on Windows?
NOTE Please see my comment about the differences in versions.
The error Not a valid command: / comes from sbt, which got executed and attempted to run the command / (the first character of the /sbt string). It can only mean that you have an sbt shell script available on PATH (possibly installed separately, outside the current working directory) or in the current working directory.
Just execute sbt assembly and it should build Spark fine.
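In other words, something roughly like this from the top-level Spark directory, assuming sbt is on your PATH (the directory name is illustrative):

cd C:\spark-1.1.0
sbt assembly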
According to the main page of Spark:
If you’d like to build Spark from scratch, visit building Spark with Maven.
which clearly states that the official build tool for Spark is now Maven (unfortunately).
You should be able to build a Spark package with the following command:
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
It worked fine for me.
I am using Windows 7, and I would like to learn Hive and Hadoop, so I installed Ubuntu 13.04 in VirtualBox. When I go to download Hadoop and Hive, the URL below offers multiple files to download. Could you please help me install Hive in the Ubuntu box, or point me to the steps I should follow?
http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-1.1.2/
hadoop-1.1.2-1.i386.rpm
hadoop-1.1.2-1.i386.rpm.mds
hadoop-1.1.2-1.x86_64.rpm
hadoop-1.1.2-1.x86_64.rpm.mds
hadoop-1.1.2-bin.tar.gz
hadoop-1.1.2-bin.tar.gz.mds
hadoop-1.1.2.tar.gz
hadoop-1.1.2.tar.gz.mds
hadoop_1.1.2-1_i386.deb
hadoop_1.1.2-1_i386.deb.mds
hadoop_1.1.2-1_x86_64.deb
hadoop_1.1.2-1_x86_64.deb.mds
Since you are new to both Hadoop and Hive, you are better off going with their .tar.gz archives, IMHO. If things don't go smoothly, you don't have to go through the whole uninstall-and-reinstall routine again and again. Just download hadoop-1.1.2.tar.gz, extract it, keep the extracted folder in some convenient location, and proceed with the configuration. If you want some help with the configuration, you can visit this post; I have tried to explain the complete procedure with all the details.
Configuring Hive is quite straightforward: download the .tar.gz file, unpack it just as you did with Hadoop, and then follow the steps shown here.
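A condensed sketch of that tar.gz route (the Hive version and the target directory are illustrative; the Hadoop URL is the mirror from the question):

# Download and unpack Hadoop 1.1.2
wget http://mirror.tcpdiag.net/apache/hadoop/common/hadoop-1.1.2/hadoop-1.1.2.tar.gz
tar -xzf hadoop-1.1.2.tar.gz -C ~/        # keep it somewhere convenient
export HADOOP_HOME=~/hadoop-1.1.2         # add to ~/.bashrc to persist

# Download and unpack Hive the same way (0.11.0 is just an example version)
wget http://archive.apache.org/dist/hive/hive-0.11.0/hive-0.11.0.tar.gz
tar -xzf hive-0.11.0.tar.gz -C ~/
export HIVE_HOME=~/hive-0.11.0
export PATH=$PATH:$HADOOP_HOME/bin:$HIVE_HOME/bin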
i386: Compiled for a 32-bit architecture
x86_64: Compiled for a 64-bit architecture
.rpm: Red Hat Package Manager file
.deb: Debian Package Manager file
.tar.gz: GZipped archive of the source files
bin.tar.gz: GZipped archive of the compiled source files
.mds: Checksum file
A Linux Package Manager is (sort of) like an installer in Windows. It automatically collects the necessary dependencies. If you download the source files you have to link (and/or compile) all the dependencies yourself.
Since you're on Ubuntu, which is a Debian-based Linux distribution, and you don't seem to have much experience in a Linux environment, I would recommend downloading the .deb file for your architecture. If I remember correctly, Ubuntu will automatically launch the package manager when you open the .deb file.
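For example, on a 64-bit Ubuntu machine the .deb route would look roughly like this (a sketch; the filename is taken from the listing above):

# Install the 64-bit Debian package with dpkg
sudo dpkg -i hadoop_1.1.2-1_x86_64.deb
# If dpkg reports missing dependencies, let apt pull them in
sudo apt-get -f install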
1. Install Hadoop as a single-node cluster setup.
2. Install Hive after that; Hive requires Hadoop to be preinstalled.
Hadoop requires at least Java 1.6, and for a single-node setup you need SSH installed on your machine; the rest of the steps are easy.
Go to this link and download the hadoop-1.1.2.tar.gz file (59M), then install it:
http://mirror.tcpdiag.net/apache/hadoop/common/stable/
Likewise, if you want to install Hive, go to the official site and download the stable version from there.
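A rough sketch of those prerequisites on Ubuntu (the package names are the usual ones and are an assumption here, not part of the original answer):

# Java 1.6 or later (OpenJDK 7 here) plus an SSH server for the single-node setup
sudo apt-get install -y openjdk-7-jdk openssh-server

# Passwordless SSH to localhost, which the Hadoop start scripts expect
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost      # should log in without asking for a password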
I'd like to create a new project with the Play! framework. My system is Mint 12 64-bit. Since the newest version of Play! is already bundled with the Typesafe stack, I thought installation would be easy. I added the Typesafe repo, then apt-get updated and apt-get installed typesafe-stack, and used the command g8 typesafehub/play-scala.
I successfully created a new project in my home folder. Now the problems begin:
I don't know how to access Play! with this installation. After creating the project, I tried to convert it into an Eclipse project, but there's no play command available in the terminal.
How can I get a "standard" Play! installation on Linux? What happens to the tools bundled in the Typesafe stack, and where do they go?
Use sbt where you would have used play. They are one and the same in reality.
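So, from the project directory the usual play subcommands map onto sbt tasks, roughly like this (eclipse assumes the sbteclipse plugin, which the typesafehub/play-scala template normally pulls in):

cd my-play-app     # the project g8 generated; the name is illustrative
sbt run            # instead of: play run
sbt test           # instead of: play test
sbt eclipse        # instead of: play eclipse (needs the sbteclipse plugin)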