Nutch 1.12 exception java.io.IOException: No FileSystem for scheme: http - nutch

I followed https://wiki.apache.org/nutch/NutchTutorial to install Nutch 1.12 and integrate it with Solr 5.5.2. I installed Nutch by following the steps in the tutorial, but when I try to integrate it with Solr by running the command below, it throws the exception shown.
bin/nutch index http://10.209.18.213:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/* -filter -normalize
Exception
2016-08-11 09:18:40,076 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-08-11 09:18:40,383 WARN segment.SegmentChecker - The input path at crawldb is not a segment... skipping
2016-08-11 09:18:40,397 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810110110.
2016-08-11 09:18:40,403 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810112551.
2016-08-11 09:18:40,408 INFO segment.SegmentChecker - Segment dir is complete: crawl/segments/20160810112952.
2016-08-11 09:18:40,409 INFO indexer.IndexingJob - Indexer: starting at 2016-08-11 09:18:40
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: deleting gone documents: false
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: URL filtering: true
2016-08-11 09:18:40,415 INFO indexer.IndexingJob - Indexer: URL normalizing: true
2016-08-11 09:18:40,672 INFO indexer.IndexWriters - Adding org.apache.nutch.indexwriter.solr.SolrIndexWriter
2016-08-11 09:18:40,672 INFO indexer.IndexingJob - Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance
solr.zookeeper.hosts : URL of the Zookeeper quorum
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: http://10.209.18.213:8983/solr
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2016-08-11 09:18:40,677 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810110110
2016-08-11 09:18:40,683 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810112551
2016-08-11 09:18:40,684 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20160810112952
2016-08-11 09:18:41,362 ERROR indexer.IndexingJob - Indexer: java.io.IOException: No FileSystem for scheme: http
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:304)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:833)
at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)

The tutorial still mentions the deprecated solrindex command. With the index command, the Solr URL is passed as a property rather than as a positional argument; in your call it is taken to be the crawldb path (see the log line IndexerMapReduce: crawldb: http://10.209.18.213:8983/solr), which is why Hadoop complains that there is no FileSystem for the http scheme. The command should be
bin/nutch index -Dsolr.server.url=http://.../solr crawldb/ -linkdb linkdb/ segments/*
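Applied to the paths from your original command, that would presumably be:
bin/nutch index -Dsolr.server.url=http://10.209.18.213:8983/solr crawl/crawldb/ -linkdb crawl/linkdb/ crawl/segments/* -filter -normalize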
Run without arguments, Nutch commands print their command-line help:
bin/nutch index
Usage: Indexer <crawldb> [-linkdb <linkdb>] [-params k1=v1&k2=v2...] (<segment> ... | -dir <segments>) [-noCommit] [-deleteGone] [-filter] [-normalize] [-addBinaryContent] [-base64]
Active IndexWriters :
SOLRIndexWriter
solr.server.url : URL of the SOLR instance
solr.zookeeper.hosts : URL of the Zookeeper quorum
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : username for authentication
solr.auth.password : password for authentication

Related

AnalysisException: Table or view not found --- Even after I create a view using "createGlobalTempView", how to fix?

I am using spark-sql 2.4.1 for streaming in my PoC.
I am trying to do a join by registering DataFrames as tables.
For that I am using createGlobalTempView, as below:
first_df.createGlobalTempView("first_tab");
second_df.createGlobalTempView("second_tab");
Dataset<Row> joinUpdatedRecordsDs = sparkSession.sql("select a.* , b.create_date, b.last_update_date from first_tab as a "
+ " inner join second_tab as b "
+ " on a.company_id = b.company_id "
);
ERROR org.apache.spark.sql.AnalysisException: Table or view not found: first_tab; line 1 pos 105
What am I doing wrong here? How do I fix it?
Some more info
On my Spark session, .enableHiveSupport() is set.
In the logs I found these traces:
19/09/13 12:40:45 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
19/09/13 12:40:45 INFO HiveMetaStore: 0: get_table : db=default tbl=first_tab
19/09/13 12:40:45 INFO audit: ugi=userrw ip=unknown-ip-addr cmd=get_table : db=default tbl=first_tab
19/09/13 12:40:45 INFO HiveMetaStore: 0: get_table : db=default tbl=second_tab
19/09/13 12:40:45 INFO audit: ugi=userrw ip=unknown-ip-addr cmd=get_table : db=default tbl=second_tab
19/09/13 12:40:45 INFO HiveMetaStore: 0: get_database: default
19/09/13 12:40:45 INFO audit: ugi=userrw ip=unknown-ip-addr cmd=get_database: default
19/09/13 12:40:45 INFO HiveMetaStore: 0: get_database: default
19/09/13 12:40:45 INFO audit: ugi=userrw ip=unknown-ip-addr cmd=get_database: default
19/09/13 12:40:45 INFO HiveMetaStore: 0: get_tables: db=default pat=*
19/09/13 12:40:45 INFO audit: ugi=userrw ip=unknown-ip-addr cmd=get_tables: db=default pat=*
System.out.println("first_tab exists : " + sparkSession.catalog().tableExists("first_tab"));
System.out.println("second_tab exists : " + sparkSession.catalog().tableExists("second_tab"));
Output
first_tab exists : false
second_tab exists : false
I tried to print the tables in the database as below, but nothing prints:
sparkSession.catalog().listTables().foreach(tab -> {
    System.out.println("tab.database :" + tab.database());
    System.out.println("tab.name :" + tab.name());
    System.out.println("tab.tableType :" + tab.tableType());
});
No output is printed, so we may conclude that no tables were created.
I tried to create the views with a "global_temp." prefix, but it throws an error:
org.apache.spark.sql.AnalysisException: It is not allowed to add database prefix `global_temp` for the TEMPORARY view name.;
at org.apache.spark.sql.execution.command.CreateViewCommand.<init>(views.scala:122)
I also tried to refer to the tables with "global_temp." prepended, but it throws the same error as above:
System.out.println("first_tab exists : " + sparkSession.catalog().tableExists("global_temp.first_tab"));
System.out.println("second_tab exists : " + sparkSession.catalog().tableExists("global_temp.second_tab"));
These global views live in a database named global_temp, so I would recommend referencing the tables in your queries as global_temp.table_name. I am not sure whether it solves your problem, but you can try it.
From the Spark source code:
Global temporary view is cross-session. Its lifetime is the lifetime of the Spark application, i.e. it will be automatically dropped when the application terminates. It's tied to a system preserved database global_temp, and we must use the qualified name to refer a global temp view, e.g. SELECT * FROM global_temp.view1.
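For the join in the question, that would look like the following (the same query, with only the global_temp prefix added):
Dataset<Row> joinUpdatedRecordsDs = sparkSession.sql("select a.* , b.create_date, b.last_update_date from global_temp.first_tab as a "
    + " inner join global_temp.second_tab as b "
    + " on a.company_id = b.company_id "
);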
Remove .enableHiveSupport() while creating the session. This should work fine:
SparkSession spark = SparkSession
    .builder()
    .appName("DatabaseMigrationUtility")
    //.enableHiveSupport()
    .getOrCreate();
As David mentioned, use the global_temp. prefix to refer to the tables.

When I write a DataFrame to a Parquet file, no errors are shown and no file is created

Hi everybody, I have a problem when saving a DataFrame. I found a similar unanswered question: Saving Spark dataFrames as parquet files - no errors, but data is not being saved. My problem is that when I test the following code:
scala> import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.linalg.Vectors
scala> val dataset = spark.createDataFrame(
| Seq((0, 18, 1.0, Vectors.dense(0.0, 10.0, 0.5), 1.0))
| ).toDF("id", "hour", "mobile", "userFeatures", "clicked")
dataset: org.apache.spark.sql.DataFrame = [id: int, hour: int ... 3 more fields]
scala> dataset.show
+---+----+------+--------------+-------+
| id|hour|mobile| userFeatures|clicked|
+---+----+------+--------------+-------+
| 0| 18| 1.0|[0.0,10.0,0.5]| 1.0|
+---+----+------+--------------+-------+
scala> dataset.write.parquet("/home/vitrion/out")
No errors are shown, and it seems that the DataFrame has been saved as a Parquet file. Surprisingly, though, no file has been created in the output directory.
This is my cluster configuration
The logfile says:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/03/01 12:56:53 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 51016@t630-0
18/03/01 12:56:53 INFO SignalUtils: Registered signal handler for TERM
18/03/01 12:56:53 INFO SignalUtils: Registered signal handler for HUP
18/03/01 12:56:53 INFO SignalUtils: Registered signal handler for INT
18/03/01 12:56:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/01 12:56:54 WARN Utils: Your hostname, t630-0 resolves to a loopback address: 127.0.1.1; using 192.168.239.218 instead (on interface eno1)
18/03/01 12:56:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/03/01 12:56:54 INFO SecurityManager: Changing view acls to: vitrion
18/03/01 12:56:54 INFO SecurityManager: Changing modify acls to: vitrion
18/03/01 12:56:54 INFO SecurityManager: Changing view acls groups to:
18/03/01 12:56:54 INFO SecurityManager: Changing modify acls groups to:
18/03/01 12:56:54 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vitrion); groups with view permissions: Set(); users with modify permissions: Set(vitrion); groups with modify permissions: Set()
18/03/01 12:56:54 INFO TransportClientFactory: Successfully created connection to /192.168.239.54:42629 after 80 ms (0 ms spent in bootstraps)
18/03/01 12:56:54 INFO SecurityManager: Changing view acls to: vitrion
18/03/01 12:56:54 INFO SecurityManager: Changing modify acls to: vitrion
18/03/01 12:56:54 INFO SecurityManager: Changing view acls groups to:
18/03/01 12:56:54 INFO SecurityManager: Changing modify acls groups to:
18/03/01 12:56:54 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(vitrion); groups with view permissions: Set(); users with modify permissions: Set(vitrion); groups with modify permissions: Set()
18/03/01 12:56:54 INFO TransportClientFactory: Successfully created connection to /192.168.239.54:42629 after 2 ms (0 ms spent in bootstraps)
18/03/01 12:56:54 INFO DiskBlockManager: Created local directory at /tmp/spark-d749d72b-6db2-4f02-8dae-481c0ea1f68f/executor-f379929a-3a6a-4366-8983-b38e19fb9cfc/blockmgr-c6d89ef4-b22a-4344-8816-23306722d40c
18/03/01 12:56:54 INFO MemoryStore: MemoryStore started with capacity 8.4 GB
18/03/01 12:56:54 INFO CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@192.168.239.54:42629
18/03/01 12:56:54 INFO WorkerWatcher: Connecting to worker spark://Worker@192.168.239.218:45532
18/03/01 12:56:54 INFO TransportClientFactory: Successfully created connection to /192.168.239.218:45532 after 1 ms (0 ms spent in bootstraps)
18/03/01 12:56:54 INFO WorkerWatcher: Successfully connected to spark://Worker@192.168.239.218:45532
18/03/01 12:56:54 INFO CoarseGrainedExecutorBackend: Successfully registered with driver
18/03/01 12:56:54 INFO Executor: Starting executor ID 2 on host 192.168.239.218
18/03/01 12:56:54 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 37178.
18/03/01 12:56:54 INFO NettyBlockTransferService: Server created on 192.168.239.218:37178
18/03/01 12:56:54 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/03/01 12:56:54 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(2, 192.168.239.218, 37178, None)
18/03/01 12:56:54 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(2, 192.168.239.218, 37178, None)
18/03/01 12:56:54 INFO BlockManager: Initialized BlockManager: BlockManagerId(2, 192.168.239.218, 37178, None)
18/03/01 12:56:54 INFO Executor: Using REPL class URI: spark://192.168.239.54:42629/classes
18/03/01 12:57:54 INFO CoarseGrainedExecutorBackend: Got assigned task 0
18/03/01 12:57:54 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
18/03/01 12:57:54 INFO TorrentBroadcast: Started reading broadcast variable 0
18/03/01 12:57:55 INFO TransportClientFactory: Successfully created connection to /192.168.239.54:35081 after 1 ms (0 ms spent in bootstraps)
18/03/01 12:57:55 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.1 KB, free 8.4 GB)
18/03/01 12:57:55 INFO TorrentBroadcast: Reading broadcast variable 0 took 103 ms
18/03/01 12:57:55 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 76.6 KB, free 8.4 GB)
18/03/01 12:57:55 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
18/03/01 12:57:55 INFO SQLHadoopMapReduceCommitProtocol: Using user defined output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
18/03/01 12:57:55 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
18/03/01 12:57:55 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.parquet.hadoop.ParquetOutputCommitter
18/03/01 12:57:55 INFO CodecConfig: Compression: SNAPPY
18/03/01 12:57:55 INFO CodecConfig: Compression: SNAPPY
18/03/01 12:57:55 INFO ParquetOutputFormat: Parquet block size to 134217728
18/03/01 12:57:55 INFO ParquetOutputFormat: Parquet page size to 1048576
18/03/01 12:57:55 INFO ParquetOutputFormat: Parquet dictionary page size to 1048576
18/03/01 12:57:55 INFO ParquetOutputFormat: Dictionary is on
18/03/01 12:57:55 INFO ParquetOutputFormat: Validation is off
18/03/01 12:57:55 INFO ParquetOutputFormat: Writer version is: PARQUET_1_0
18/03/01 12:57:55 INFO ParquetOutputFormat: Maximum row group padding size is 0 bytes
18/03/01 12:57:55 INFO ParquetOutputFormat: Page size checking is: estimated
18/03/01 12:57:55 INFO ParquetOutputFormat: Min row count for page size check is: 100
18/03/01 12:57:55 INFO ParquetOutputFormat: Max row count for page size check is: 10000
18/03/01 12:57:55 INFO ParquetWriteSupport: Initialized Parquet WriteSupport with Catalyst schema:
{
  "type" : "struct",
  "fields" : [ {
    "name" : "id",
    "type" : "integer",
    "nullable" : false,
    "metadata" : { }
  }, {
    "name" : "hour",
    "type" : "integer",
    "nullable" : false,
    "metadata" : { }
  }, {
    "name" : "mobile",
    "type" : "double",
    "nullable" : false,
    "metadata" : { }
  }, {
    "name" : "userFeatures",
    "type" : {
      "type" : "udt",
      "class" : "org.apache.spark.ml.linalg.VectorUDT",
      "pyClass" : "pyspark.ml.linalg.VectorUDT",
      "sqlType" : {
        "type" : "struct",
        "fields" : [ {
          "name" : "type",
          "type" : "byte",
          "nullable" : false,
          "metadata" : { }
        }, {
          "name" : "size",
          "type" : "integer",
          "nullable" : true,
          "metadata" : { }
        }, {
          "name" : "indices",
          "type" : {
            "type" : "array",
            "elementType" : "integer",
            "containsNull" : false
          },
          "nullable" : true,
          "metadata" : { }
        }, {
          "name" : "values",
          "type" : {
            "type" : "array",
            "elementType" : "double",
            "containsNull" : false
          },
          "nullable" : true,
          "metadata" : { }
        } ]
      }
    },
    "nullable" : true,
    "metadata" : { }
  }, {
    "name" : "clicked",
    "type" : "double",
    "nullable" : false,
    "metadata" : { }
  } ]
}
and corresponding Parquet message type:
message spark_schema {
  required int32 id;
  required int32 hour;
  required double mobile;
  optional group userFeatures {
    required int32 type (INT_8);
    optional int32 size;
    optional group indices (LIST) {
      repeated group list {
        required int32 element;
      }
    }
    optional group values (LIST) {
      repeated group list {
        required double element;
      }
    }
  }
  required double clicked;
}
18/03/01 12:57:55 INFO CodecPool: Got brand-new compressor [.snappy]
18/03/01 12:57:55 INFO InternalParquetRecordWriter: Flushing mem columnStore to file. allocated memory: 84
18/03/01 12:57:55 INFO FileOutputCommitter: Saved output of task 'attempt_20180301125755_0000_m_000000_0' to file:/home/vitrion/out/_temporary/0/task_20180301125755_0000_m_000000
18/03/01 12:57:55 INFO SparkHadoopMapRedUtil: attempt_20180301125755_0000_m_000000_0: Committed
18/03/01 12:57:55 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1967 bytes result sent to driver
Can you please help me to solve this?
Thank you
Have you tried writing without the Vector column? I have seen cases in the past where complex data structures caused writing issues.
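One quick way to test that suggestion (a sketch; the output path out_no_vector is just a hypothetical example) would be to drop the vector column before writing:
scala> dataset.drop("userFeatures").write.parquet("/home/vitrion/out_no_vector")
If that write produces files, the userFeatures vector column is the likely culprit.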

Spark: Frequent Pattern Mining: issues in saving the results

I am using Spark's FP-growth algorithm. I was getting OOM errors when doing a collect, so I changed the code to save the results to a text file on HDFS rather than collecting them on the driver node. Here is the related code:
// Model building:
val fpg = new FPGrowth()
  .setMinSupport(0.01)
  .setNumPartitions(10)
val model = fpg.run(transaction_distinct)
Here is a transformation that should give me an RDD[String]:
val mymodel = model.freqItemsets.map { itemset =>
  val model_res = itemset.items.mkString("[", ",", "]") + ", " + itemset.freq
  model_res
}
I then save the model results as follows. Unfortunately, this is really slow:
mymodel.saveAsTextFile("fpm_model")
I get these errors:
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@ipaddress:46811] -> [akka.tcp://sparkExecutor@hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor@hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720] akka.event.Logging$Error$NoCause$
16/02/04 14:47:28 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(3, hostname, 58683)
16/02/04 14:47:28 INFO BlockManagerMaster: Removed 3 successfully in removeExecutor
16/02/04 14:47:28 ERROR ErrorMonitor: AssociationError [akka.tcp://sparkDriver@ipaddress:46811] -> [akka.tcp://sparkExecutor@hostname:39720]: Error [Association failed with [akka.tcp://sparkExecutor@hostname:39720]][akka.remote.EndpointAssociationException: Association failed with [akka.tcp://sparkExecutor@hostname:39720]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: hostname/ipaddress:39720

How to use a specific directory for metastore with HiveContext?

So this is what I tried in the Spark shell:
scala> import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.HiveContext
scala> import java.nio.file.Files
import java.nio.file.Files
scala> val hiveDir = Files.createTempDirectory("hive")
hiveDir: java.nio.file.Path = /var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338
scala> val hiveContext = new HiveContext(sc)
15/12/31 12:05:27 INFO HiveContext: Initializing execution hive, version 0.13.1
hiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@6f959640
scala> hiveContext.sql(s"SET hive.metastore.warehouse.dir=${hiveDir.toUri}")
15/12/31 12:05:34 INFO HiveContext: Initializing HiveMetastoreConnection version 0.13.1 using Spark classes.
...
res0: org.apache.spark.sql.DataFrame = [: string]
scala> Seq("create database foo").foreach(hiveContext.sql)
15/12/31 12:05:42 INFO ParseDriver: Parsing command: create database foo
15/12/31 12:05:42 INFO ParseDriver: Parse Completed
...
15/12/31 12:05:43 INFO HiveMetaStore: 0: create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr cmd=create_database: Database(name:foo, description:null, locationUri:null, parameters:null, ownerName:aa8y, ownerType:USER)
15/12/31 12:05:43 INFO HiveMetaStore: 0: get_database: foo
15/12/31 12:05:43 INFO audit: ugi=aa8y ip=unknown-ip-addr cmd=get_database: foo
15/12/31 12:05:43 ERROR RetryingHMSHandler: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:734)
...
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/31 12:05:43 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:248)
...
at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:242)
... 78 more
15/12/31 12:05:43 ERROR Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
15/12/31 12:05:43 ERROR ClientWrapper:
======================
HIVE FAILURE OUTPUT
======================
SET hive.metastore.warehouse.dir=file:///var/folders/gg/g3hk6fcj4rxc6lb1qsvxc_vdxxwf28/T/hive5050481206678469338/
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
======================
END HIVE FAILURE OUTPUT
======================
org.apache.spark.sql.execution.QueryExecutionException: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Unable to create database path file:/user/hive/warehouse/foo.db, failed to create database foo)
at org.apache.spark.sql.hive.client.ClientWrapper$$anonfun$runHive$1.apply(ClientWrapper.scala:349)
...
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
scala>
But it doesn't seem to recognize the directory I am setting. I've removed content from the stack trace since it was very verbose; the entire stack trace is here.
I am not sure what I am doing wrong. Would appreciate any help provided.

Selenium Webdriver requires restarts to function consistently

My testing stack consists of the latest version of Selenium Server (2.33.0, aka selenium-server-standalone-2.33.0.jar), Mocha, Node.js, and PhantomJS.
My question regards the following code:
var webdriver = require('../../../lib/selenium/node_modules/selenium-webdriver/'),
    driver = new webdriver.Builder().
        withCapabilities({'browserName': 'phantomjs'}).
        build();
driver.manage().timeouts().implicitlyWait(15000);
describe('Wordpress', function() {
  it('should be able to log in', function(done) {
    driver.get('http://#### REDACTED ####/wp-login.php');
    driver.findElement(webdriver.By.css('#user_login')).sendKeys('#### REDACTED ####');
    driver.findElement(webdriver.By.css('#user_pass')).sendKeys('#### REDACTED ####');
    driver.findElement(webdriver.By.css('#wp-submit')).click();
    // #wpwrap is an element on the Wordpress dashboard that is displayed once
    // the user is logged in. By testing for its presence, we can determine
    // if the login attempt succeeded.
    driver.findElement(webdriver.By.css('#wpwrap')).then(function(v) {
      done();
    });
  });
});
On my local system, OS X, the test runs fine consistently. However, once the test is uploaded to our CentOS server (where we hope to do Continuous Integration testing), the test behaves extremely strangely.
After Selenium Server is started, the test runs successfully once. From that point on, the test succeeds only about one time in ten. Restarting Selenium Server guarantees that the test will run successfully. In fact, if Selenium Server is restarted every time the test is run, the test succeeds every time.
How can I get this test to succeed without restarting Selenium Server every time?
Thank you so much for your help! :)
UPDATE: In addition to the error log below, I'm also occasionally getting the following error:
Exception in thread "Thread-21" java.lang.OutOfMemoryError: unable to create new native thread
Details on the error messages follow below:
A successful test yields the following output from Mocha:
[s5rich@host ~]$ mocha test/selenium/acceptance/simple.js
Wordpress
✓ should be able to log in (2604ms)
1 passing (3 seconds)
A successful test also yields the following output from Selenium Server:
23:21:50.517 INFO - Executing: [new session: {browserName=phantomjs}] at URL: /session)
23:21:50.527 INFO - Creating a new session for Capabilities [{browserName=phantomjs}]
23:21:50.547 INFO - executable: /usr/local/bin/phantomjs
23:21:50.547 INFO - port: 26515
23:21:50.547 INFO - arguments: [--webdriver=26515, --webdriver-logfile=/home/s5rich/phantomjsdriver.log]
23:21:50.547 INFO - environment: {}
PhantomJS is launching GhostDriver...
[INFO - 2013-07-24T05:21:50.923Z] GhostDriver - Main - running on port 26515
[INFO - 2013-07-24T05:21:51.435Z] Session [f235d040-f420-11e2-8d90-f50327bc3449] - CONSTRUCTOR - Desired Capabilities: {"browserName":"phantomjs"}
[INFO - 2013-07-24T05:21:51.435Z] Session [f235d040-f420-11e2-8d90-f50327bc3449] - CONSTRUCTOR - Negotiated Capabilities: {"browserName":"phantomjs","version":"1.9.1","driverName":"ghostdriver","driverVersion":"1.0.3","platform":"linux-unknown-32bit","javascriptEnabled":true,"takesScreenshot":true,"handlesAlerts":false,"databaseEnabled":false,"locationContextEnabled":false,"applicationCacheEnabled":false,"browserConnectionEnabled":false,"cssSelectorsEnabled":true,"webStorageEnabled":false,"rotatable":false,"acceptSslCerts":false,"nativeEvents":true,"proxy":{"proxyType":"direct"}}
[INFO - 2013-07-24T05:21:51.435Z] SessionManagerReqHand - _postNewSessionCommand - New Session Created: f235d040-f420-11e2-8d90-f50327bc3449
23:21:51.495 INFO - Done: /session
23:21:51.504 INFO - Executing: org.openqa.selenium.remote.server.handler.GetSessionCapabilities@46b78d at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2)
23:21:51.505 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2
23:21:51.520 INFO - Executing: [get: http://#### REDACTED ####/wp-login.php] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/url)
23:21:51.821 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/url
23:21:51.827 INFO - Executing: [find element: By.selector: #user_login] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element)
23:21:51.874 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element
23:21:51.883 INFO - Executing: [send keys: 0 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@20788bf8, [#### REDACTED ####]] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/0/value)
23:21:51.939 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/0/value
23:21:51.948 INFO - Executing: [find element: By.selector: #user_pass] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element)
23:21:51.965 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element
23:21:52.001 INFO - Executing: [send keys: 1 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@20788bf9, [#### REDACTED ####]] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/1/value)
23:21:52.065 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/1/value
23:21:52.074 INFO - Executing: [find element: By.selector: #wp-submit] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element)
23:21:52.099 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element
23:21:52.106 INFO - Executing: [click: 2 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@20788bfa] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/2/click)
23:21:52.842 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element/2/click
23:21:52.850 INFO - Executing: [find element: By.selector: #wpwrap] at URL: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element)
23:21:52.871 INFO - Done: /session/8750a6b4-ec7d-4313-a86d-04ac344b74f2/element
A failed test yields the following output from Mocha:
[s5rich@host ~]$ mocha test/selenium/acceptance/simple.js
Wordpress
1) should be able to log in
0 passing (2 seconds)
1 failing
1) Wordpress should be able to log in:
Uncaught UnknownError: Error Message => 'Unable to find element with css selector '#wpwrap''
caused by Request => {"headers":{"Accept":"application/json, image/png","Connection":"Keep-Alive","Content-Length":"42","Content-Type":"application/json; charset=utf-8","Host":"localhost:2897"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"css selector\",\"value\":\"#wpwrap\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/77a0a110-f421-11e2-a6fd-61cd002d7d02/element"}
Build info: version: '2.33.0', revision: '4e90c97', time: '2013-05-22 15:32:38'
System info: os.name: 'Linux', os.arch: 'i386', os.version: '2.6.32-042stab076.8', java.version: '1.7.0'
Driver info: driver.version: unknown
at new bot.Error (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/atoms/error.js:108:18)
at Object.bot.response.checkResponse (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/atoms/response.js:106:9)
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/webdriver.js:262:20
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/goog/base.js:1112:15
at webdriver.promise.ControlFlow.runInNewFrame_ (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:1431:20)
at notify (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:315:12)
at notifyAll (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:284:7)
at fulfill (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:389:7)
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:1298:10
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/goog/base.js:1112:15
at webdriver.promise.ControlFlow.runInNewFrame_ (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:1431:20)
at notify (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:315:12)
at notifyAll (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:284:7)
at fulfill (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:389:7)
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/goog/base.js:1112:15
at webdriver.promise.ControlFlow.runInNewFrame_ (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:1431:20)
at notify (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:315:12)
at notifyAll (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:284:7)
at fulfill (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:389:7)
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/promise.js:600:51
at /home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/http/http.js:96:5
at IncomingMessage.<anonymous> (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/http/index.js:113:7)
at IncomingMessage.EventEmitter.emit (events.js:117:20)
at _stream_readable.js:910:16
at process._tickCallback (node.js:415:13)
==== async task ====
WebDriver.findElement(By.cssSelector("#wpwrap"))
at webdriver.WebDriver.schedule (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/webdriver.js:246:15)
at webdriver.WebDriver.findElement (/home/s5rich/lib/selenium/node_modules_osx/selenium-webdriver/lib/webdriver/webdriver.js:685:17)
at Context.<anonymous> (/home/s5rich/test/selenium/acceptance/simple.js:12:10)
at Test.Runnable.run (/usr/local/lib/node_modules/mocha/lib/runnable.js:194:15)
at Runner.runTest (/usr/local/lib/node_modules/mocha/lib/runner.js:355:10)
at /usr/local/lib/node_modules/mocha/lib/runner.js:401:12
at next (/usr/local/lib/node_modules/mocha/lib/runner.js:281:14)
at /usr/local/lib/node_modules/mocha/lib/runner.js:290:7
at next (/usr/local/lib/node_modules/mocha/lib/runner.js:234:23)
at Object._onImmediate (/usr/local/lib/node_modules/mocha/lib/runner.js:258:5)
at processImmediate [as _immediateCallback] (timers.js:330:15)
A failed test also yields the following output from Selenium Server:
23:25:34.742 INFO - Executing: [new session: {browserName=phantomjs}] at URL: /session)
23:25:34.743 INFO - Creating a new session for Capabilities [{browserName=phantomjs}]
23:25:34.744 INFO - executable: /usr/local/bin/phantomjs
23:25:34.744 INFO - port: 2897
23:25:34.744 INFO - arguments: [--webdriver=2897, --webdriver-logfile=/home/s5rich/phantomjsdriver.log]
23:25:34.744 INFO - environment: {}
PhantomJS is launching GhostDriver...
[INFO - 2013-07-24T05:25:34.879Z] GhostDriver - Main - running on port 2897
[INFO - 2013-07-24T05:25:35.270Z] Session [77a0a110-f421-11e2-a6fd-61cd002d7d02] - CONSTRUCTOR - Desired Capabilities: {"browserName":"phantomjs"}
[INFO - 2013-07-24T05:25:35.270Z] Session [77a0a110-f421-11e2-a6fd-61cd002d7d02] - CONSTRUCTOR - Negotiated Capabilities: {"browserName":"phantomjs","version":"1.9.1","driverName":"ghostdriver","driverVersion":"1.0.3","platform":"linux-unknown-32bit","javascriptEnabled":true,"takesScreenshot":true,"handlesAlerts":false,"databaseEnabled":false,"locationContextEnabled":false,"applicationCacheEnabled":false,"browserConnectionEnabled":false,"cssSelectorsEnabled":true,"webStorageEnabled":false,"rotatable":false,"acceptSslCerts":false,"nativeEvents":true,"proxy":{"proxyType":"direct"}}
[INFO - 2013-07-24T05:25:35.270Z] SessionManagerReqHand - _postNewSessionCommand - New Session Created: 77a0a110-f421-11e2-a6fd-61cd002d7d02
23:25:35.275 INFO - Done: /session
23:25:35.283 INFO - Executing: org.openqa.selenium.remote.server.handler.GetSessionCapabilities@13b4ce4 at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee)
23:25:35.284 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee
23:25:35.297 INFO - Executing: [get: http://#### REDACTED ####/wp-login.php] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/url)
23:25:35.592 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/url
23:25:35.597 INFO - Executing: [find element: By.selector: #user_login] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element)
23:25:35.619 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element
23:25:35.631 INFO - Executing: [send keys: 0 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@240035bc, [#### REDACTED ####]] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/0/value)
23:25:35.683 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/0/value
23:25:35.695 INFO - Executing: [find element: By.selector: #user_pass] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element)
23:25:35.712 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element
23:25:35.723 INFO - Executing: [send keys: 1 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@240035bd, [#### REDACTED ####]] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/1/value)
23:25:35.783 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/1/value
23:25:35.800 INFO - Executing: [find element: By.selector: #wp-submit] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element)
23:25:35.822 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element
23:25:35.832 INFO - Executing: [click: 2 org.openqa.selenium.support.events.EventFiringWebDriver$EventFiringWebElement@240035be] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/2/click)
23:25:36.105 INFO - Done: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element/2/click
23:25:36.121 INFO - Executing: [find element: By.selector: #wpwrap] at URL: /session/b62d0c67-2000-439c-a0bd-f2c100350dee/element)
e = java.awt.HeadlessException:
No X11 DISPLAY variable was set, but this program performed an operation which requires it.
23:25:36.285 WARN - Exception thrown
org.openqa.selenium.NoSuchElementException: Error Message => 'Unable to find element with css selector '#wpwrap''
caused by Request => {"headers":{"Accept":"application/json, image/png","Connection":"Keep-Alive","Content-Length":"42","Content-Type":"application/json; charset=utf-8","Host":"localhost:2897"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"css selector\",\"value\":\"#wpwrap\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/77a0a110-f421-11e2-a6fd-61cd002d7d02/element"}
Command duration or timeout: 149 milliseconds
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '2.33.0', revision: '4e90c97', time: '2013-05-22 15:32:38'
System info: os.name: 'Linux', os.arch: 'i386', os.version: '2.6.32-042stab076.8', java.version: '1.7.0'
Session ID: 77a0a110-f421-11e2-a6fd-61cd002d7d02
Driver info: org.openqa.selenium.phantomjs.PhantomJSDriver
Capabilities [{platform=LINUX, acceptSslCerts=false, javascriptEnabled=true, browserName=phantomjs, rotatable=false, driverVersion=1.0.3, locationContextEnabled=false, version=1.9.1, databaseEnabled=false, cssSelectorsEnabled=true, handlesAlerts=false, browserConnectionEnabled=false, proxy={proxyType=direct}, webStorageEnabled=false, nativeEvents=true, driverName=ghostdriver, applicationCacheEnabled=false, takesScreenshot=true}]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:191)
at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:145)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:554)
at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:307)
at org.openqa.selenium.remote.RemoteWebDriver.findElementByCssSelector(RemoteWebDriver.java:396)
at org.openqa.selenium.By$ByCssSelector.findElement(By.java:407)
at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:299)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.openqa.selenium.support.events.EventFiringWebDriver$2.invoke(EventFiringWebDriver.java:101)
at $Proxy1.findElement(Unknown Source)
at org.openqa.selenium.support.events.EventFiringWebDriver.findElement(EventFiringWebDriver.java:180)
at org.openqa.selenium.remote.server.handler.FindElement.call(FindElement.java:47)
at org.openqa.selenium.remote.server.handler.FindElement.call(FindElement.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.openqa.selenium.remote.server.DefaultSession$1.run(DefaultSession.java:169)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.openqa.selenium.remote.ScreenshotException: Screen shot has been taken
Build info: version: '2.33.0', revision: '4e90c97', time: '2013-05-22 15:32:38'
System info: os.name: 'Linux', os.arch: 'i386', os.version: '2.6.32-042stab076.8', java.version: '1.7.0'
Driver info: driver.version: EventFiringWebDriver
at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:125)
... 20 more
Caused by: org.openqa.selenium.remote.ErrorHandler$UnknownServerException: Error Message => 'Unable to find element with css selector '#wpwrap''
caused by Request => {"headers":{"Accept":"application/json, image/png","Connection":"Keep-Alive","Content-Length":"42","Content-Type":"application/json; charset=utf-8","Host":"localhost:2897"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"css selector\",\"value\":\"#wpwrap\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/77a0a110-f421-11e2-a6fd-61cd002d7d02/element"}
Build info: version: '2.33.0', revision: '4e90c97', time: '2013-05-22 15:32:38'
System info: os.name: 'Linux', os.arch: 'i386', os.version: '2.6.32-042stab076.8', java.version: '1.7.0'
Driver info: driver.version: unknown
23:25:36.293 WARN - Exception: Error Message => 'Unable to find element with css selector '#wpwrap''
caused by Request => {"headers":{"Accept":"application/json, image/png","Connection":"Keep-Alive","Content-Length":"42","Content-Type":"application/json; charset=utf-8","Host":"localhost:2897"},"httpVersion":"1.1","method":"POST","post":"{\"using\":\"css selector\",\"value\":\"#wpwrap\"}","url":"/element","urlParsed":{"anchor":"","query":"","file":"element","directory":"/","path":"/element","relative":"/element","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/element","queryKey":{},"chunks":["element"]},"urlOriginal":"/session/77a0a110-f421-11e2-a6fd-61cd002d7d02/element"}
Build info: version: '2.33.0', revision: '4e90c97', time: '2013-05-22 15:32:38'
System info: os.name: 'Linux', os.arch: 'i386', os.version: '2.6.32-042stab076.8', java.version: '1.7.0'
Driver info: driver.version: unknown
From what I see, it is a simple problem to solve. You have an unhandled exception that is thrown when there is a failure: NoSuchElementException. I don't think you need to restart your grid node every time (assuming you are running your grid hub separately from your grid node instances). It may be that successive runs of the browser on your grid node use just enough extra memory to cause a minor delay in the page, breaking your script. All you need to do is handle the NoSuchElementException gracefully by wrapping the lookup in a retry loop. In the Java bindings you can achieve the same effect with FluentWait and its .ignoring method.
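For the Node.js bindings used in the question, a rough sketch of that retry idea is to poll for the element instead of failing on the first lookup. This assumes driver.wait and driver.isElementPresent from the selenium-webdriver JS API of that era; the 15000 ms timeout simply mirrors the implicit wait already set in the question:
driver.wait(function() {
  // Poll until the dashboard element is present; the returned promise
  // resolves to a boolean, and driver.wait retries until it is truthy.
  return driver.isElementPresent(webdriver.By.css('#wpwrap'));
}, 15000).then(function() {
  done();
});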
