apache shark installation on spark cluster - apache-spark

When running shark on spark cluster with one node
I'm getting the following error. can anyone please solve it...
Thanks in advance
error::
Executor updated: app-20140619165031-0000/0 is now FAILED (class java.io.IOException: Cannot run program "/home/trendwise/Hadoop_tools/jdk1.7.0_40/bin/java" (in directory "/home/trendwise/Hadoop_tools/spark/spark-0.9.1-bin-hadoop1/work/app-20140619165031-0000/0"): error=2, No such file or directory)

In my experience "No such file or directory" is often a symptom of some other exception. Usually a "no space left on device" and sometimes "too many files open". Mine the logs for other stack traces and monitor your disk usage and inode usage to confirm.

Related

Spark 2.4 Got an error when resolving hostNames Falling back to /default-rack

Running an application in in client mode, the driver logs are printed with the below info messages, any idea on how to resolve this? Any spark configs to be updated? or missing?
[INFO ][dispatcher-event-loop-29][SparkRackResolver:54] Got an error when resolving hostNames. Falling back to /default-rack for all
The jobs runs fine, this msg is not in the executor logs.
Check this bug:
https://issues.apache.org/jira/browse/SPARK-28005
If you want to suppress this in the logs you can try to add this into your log4j.properties
log4j.logger.org.apache.spark.deploy.yarn.SparkRackResolver=ERROR
This can happen while using spart-submit with master yarn in a deploy mode local (not using --deploy-mode cluster) and the path to topology.py script is not correct into your core-site.xml.
Path to core-site.xml can be set via environment variable HADOOP_CONF_DIR (or YARN_CONF_DIR).
Check the path in the param net.topology.script.file.name value of core-site.xml.
If the path is incorrect, deploying driver in local mode will lead to error of executing with the following warning:
23/01/15 18:39:43 WARN ScriptBasedMapping: Exception running /home/alexander/xxx/.conf/topology.py 10.15.21.199
java.io.IOException: Cannot run program "/etc/hadoop/conf.cloudera.yarn/topology.py" (in directory "/home/john"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
...
23/01/15 18:39:43 INFO SparkRackResolver: Got an error when resolving hostNames. Falling back to /default-rack for all

SaveAsTextFile() results in Mkdirs failed - Apache Spark

Good, I currently have a cluster in spark with 3 working nodes. I also have a nfs server mounted on /var/nfs with 777 permission for testing. I'm trying to run the following code to count the words in a text:
root#master:/home/usuario# MASTER="spark://10.0.0.1:7077" spark-shell
val inputFile = sc.textFile("/var/nfs/texto.txt")
val counts = inputFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts.toDebugString
counts.cache()
counts.count()
counts.saveAsTextFile("/home/usuario/output");
But spark gives me the following error:
Caused by: java.io.IOException: Mkdirs failed to create
file:/var/nfs/output-4/_temporary/0/_temporary/attempt_20170614094558_0007_m_000000_20
(exists=false, cwd=file:/opt/spark/work/app-20170614093824-0005/2)
I have searched for many websites but I can not find the solution for my case. All help is grateful.
When you start a spark-shell with MASTER as valid application-master url - and not local[*], spark treats all paths as HDFS; and performs IO operations only in underlying HDFS; not in local.
YOu have mounted the locations in local file-system; and those paths are not existed in HDFS.
That's why, the error says: exists=false
Same issue with me. Check ownership of your directory again.
sudo chown -R owner-user:owner-group directory

i have just begun to use android studio and i cant seem to get my gradle to sync with my application. here is what it shows :

7:46:20 PM Gradle sync started
7:46:35 PM Gradle sync failed: Unable to start the daemon process.
This problem might be caused by incorrect configuration of the daemon.
For example, an unrecognized jvm option is used.
Please refer to the user guide chapter on the daemon at https://docs.gradle.org/2.10/userguide/gradle_daemon.html
Please read the following process output to find out more:
-----------------------
Error occurred during initialization of VM
Could not reserve enough space for object heap
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Consult IDE log for more details (Help | Show Log)
the jvm version is 1.7.0_79
and the studio version is 2.1.1
Error occurred during initialization of VM Could not reserve enough space for object heap Error: Could not create the Java Virtual Machine.
There's no space available in RAM. To fix go to /android-studio-dir/bin and edit studio.vmoptions and studio64.vmoptions to increment the -Xmx and to reserve more memory to Java. Note that the number of processes active may influence on that.
Probably, the /tmp location is full..
Found this somewhere..
Use df command
df
You should see an output with a line like this:
tmpfs 102400 102312 88 100% /tmp
So to change the size of the tmp file:
sudo mount -o remount,size=2G /tmp
Done! Now, It should work..

Cassandra 2.1.8: Node refuses to start with NPE in removeUnfinishedCompactionLeftovers

I am trying to upgrade a Cassandra 2.1.0 cluster to 2.1.8 (latest release).
When I start a first node with 2.1.8 runtime, I get an error and the node refuses to start.
This is the error's stack trace :
org.apache.cassandra.io.FSReadError: java.lang.NullPointerException
at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:642) ~[apache-cassandra-2.1.8.jar:2.1.8]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:302) [apache-cassandra-2.1.8.jar:2.1.8]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:524) [apache-cassandra-2.1.8.jar:2.1.8]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:613) [apache-cassandra-2.1.8.jar:2.1.8]
Caused by: java.lang.NullPointerException: null
at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:634) ~[apache-cassandra-2.1.8.jar:2.1.8]
... 3 common frames omitted
FSReadError in Failed to remove unfinished compaction leftovers (file: /home/nudgeca2/datas/data/main/segment-97b5ba00571011e49a928bffe429b6b5/main-segment-ka-15432-Statistics.db). See log for details.
at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:642)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:302)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:524)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:613)
Caused by: java.lang.NullPointerException
at org.apache.cassandra.db.ColumnFamilyStore.removeUnfinishedCompactionLeftovers(ColumnFamilyStore.java:634)
... 3 more
Exception encountered during startup: java.lang.NullPointerException
The cluster has 7 nodes and it turns on AWS Linux EC2 instances.
The node I try to upgrade was stopped after a nodetool drain.
Then I tried to come back to 2.1.0 runtime but I now get a similar error.
I also tried to stop and start another node and everything was ok, the node restarted without any problem.
I tried to touch the missing file (as it should be removed, I thought it would perhaps not need a specific content). I had two other files with the same error that I also touched. And finally the node fails further while trying to read these files.
Anyone has any idea what I should do ?
Thank you for any help.
It might be worth opening a Jira for that issue, so if nothing else, they can catch the NPE and provide a better error message.
It looks like it's trying to open:
file: /home/nudgeca2/datas/data/main/segment-97b5ba00571011e49a928bffe429b6b5/main-segment-ka-15432-Statistics.db
It's possible that it's trying to read that file because it finds the associated data file: (/home/nudgeca2/datas/data/main/segment-97b5ba00571011e49a928bffe429b6b5/main-segment-ka-15432-Data.db). Does that data file exist? I'd be tempted to move it out of the way, and see if it starts properly.

Cassandra unable to start with error too many files open

Cassandra unable to start with error too many files open. (apache-cassandra-1.2.4)
error file contains:
ERROR 11:53:11,893 Exception encountered during startup
java.lang.RuntimeException: java.io.FileNotFoundException: /home/analysis.engine/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ib-289887-Data.db (Too many open files)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:46)
at org.apache.cassandra.io.util.CompressedSegmentedFile.createReader(CompressedSegmentedFile.java:57)
at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:976)
at org.apache.cassandra.db.columniterator.SimpleSliceReader.<init>(SimpleSliceReader.java:61)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:68)
at org.apache.cassandra.db.columniterator.SSTableSliceIterator.<init>(SSTableSliceIterator.java:44)
at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:101)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:274)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
at org.apache.cassandra.db.DefsTable.serializedColumnFamilies(DefsTable.java:274)
at org.apache.cassandra.db.DefsTable.loadFromTable(DefsTable.java:155)
at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:563)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:411)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:454)
Caused by: java.io.FileNotFoundException: /home/analysis.engine/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-ib-289887-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:67)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:75)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:42)
... 19 more
Please help me to find out the problem and it's solution.
#N D Thokare
Do you use the new Java driver from datastax to connect to Cassandra ?
I had the same issue recently, and the reason was that I bootstrapped many Session at the same time.
Please use root permission and set:
ulimit -n 1000000

Resources