I have a project where I need to configure Spark and HBase in a local environment. I downloaded Spark 2.2.1, Hadoop 2.7, and HBase 1.1.8 and configured them accordingly on a standalone single-node Ubuntu 14.04 OS.
I am able to pull and push data from Spark to HDFS, but not to HBase.
core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
[root@localhost conf]# cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.servicerpc-bind-host</name>
<value>0.0.0.0</value>
</property>
</configuration>
spark-env.sh:
[root@localhost conf]# cat spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_INSTANCES=1
export SPARK_MASTER_IP=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_DIR=/app/spark/tmp
# Options read in YARN client mode
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export SPARK_EXECUTOR_INSTANCES=1
export SPARK_EXECUTOR_CORES=1
export SPARK_EXECUTOR_MEMORY=1G
export SPARK_DRIVER_MEMORY=1G
export SPARK_YARN_APP_NAME=Spark
export SPARK_CLASSPATH=/opt/hbase/lib/*
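(Note: SPARK_CLASSPATH is deprecated in Spark 2.x, so this export may not take effect. The supported route is the extra-classpath options; a sketch, not verified against this setup, for spark-defaults.conf:)
spark.driver.extraClassPath /opt/hbase/lib/*
spark.executor.extraClassPath /opt/hbase/lib/*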
hbase-site.xml:
[root@localhost conf]# cat hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>hdfs://localhost:9000/zookeeper</value>
</property>
<property>
<name>hbase.master.dns.interface</name>
<value>default</value>
</property>
<property>
<name>hbase.master.ipc.address</name>
<value>localhost</value>
</property>
<property>
<name>hbase.regionserver.dns.interface</name>
<value>default</value>
</property>
<property>
<name>hbase.regionserver.ipc.address</name>
<value>HOSTNAME</value>
</property>
<property>
<name>hbase.zookeeper.dns.interface</name>
<value>default</value>
</property>
</configuration>
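(One thing to double-check here: hbase.zookeeper.property.dataDir is handed to the embedded ZooKeeper, which expects a local filesystem path rather than an hdfs:// URL. A sketch, where /app/zookeeper is just an illustrative local path:)
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/app/zookeeper</value>
</property>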
spark-defaults.conf:
[root@localhost conf]# cat spark-defaults.conf
spark.master spark://127.0.0.1:7077
spark.yarn.dist.files /opt/spark/conf/hbase-site.xml
Errors:
Even though the HBase libs (jars) are exported in spark-env.sh, spark-shell is unable to import the HBase libraries (e.g. HBaseConfiguration):
scala> import org.apache.hadoop.hbase.HBaseConfiguration
<console>:23: error: object hbase is not a member of package org.apache.hadoop
import org.apache.hadoop.hbase.HBaseConfiguration
^
If I load these jars through --driver-class-path instead, the session works:
spark-shell --master local --driver-class-path=/opt/hbase/lib/*
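(The transcript below omits the initial setup; it assumes imports and a configuration along these lines:)
scala> import org.apache.hadoop.hbase.HBaseConfiguration
scala> import org.apache.hadoop.hbase.TableName
scala> import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, Put}
scala> import org.apache.hadoop.hbase.util.Bytes
scala> val conf = HBaseConfiguration.create()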
scala> conf.set("hbase.zookeeper.quorum","localhost")
scala> conf.set("hbase.zookeeper.property.clientPort", "2181")
scala> val connection: Connection = ConnectionFactory.createConnection(conf)
connection: org.apache.hadoop.hbase.client.Connection = hconnection-0x2a4cb8ae
scala> val tableName = connection.getTable(TableName.valueOf("employee"))
tableName: org.apache.hadoop.hbase.client.Table = employee;hconnection-0x2a4cb8ae
scala> val insertData = new Put(Bytes.toBytes("1"))
insertData: org.apache.hadoop.hbase.client.Put = {"totalColumns":0,"row":"1","families":{}}
scala> insertData.addColumn(Bytes.toBytes("emp personal data "), Bytes.toBytes("Name"), Bytes.toBytes("Jeevan"))
res3: org.apache.hadoop.hbase.client.Put = {"totalColumns":1,"row":"1","families":{"emp personal data ":[{"qualifier":"Name","vlen":6,"tag":[],"timestamp":9223372036854775807}]}}
scala> insertData.addColumn(Bytes.toBytes("emp personal data "), Bytes.toBytes("City"), Bytes.toBytes("San Jose"))
res4: org.apache.hadoop.hbase.client.Put = {"totalColumns":2,"row":"1","families":{"emp personal data ":[{"qualifier":"Name","vlen":6,"tag":[],"timestamp":9223372036854775807},{"qualifier":"City","vlen":8,"tag":[],"timestamp":9223372036854775807}]}}
scala> insertData.addColumn(Bytes.toBytes("emp personal data "), Bytes.toBytes("Company"), Bytes.toBytes("Cisco"))
res5: org.apache.hadoop.hbase.client.Put = {"totalColumns":3,"row":"1","families":{"emp personal data ":[{"qualifier":"Name","vlen":6,"tag":[],"timestamp":9223372036854775807},{"qualifier":"City","vlen":8,"tag":[],"timestamp":9223372036854775807},{"qualifier":"Company","vlen":5,"tag":[],"timestamp":9223372036854775807}]}}
scala> insertData.addColumn(Bytes.toBytes("emp personal data "), Bytes.toBytes("location"), Bytes.toBytes("San Jose"))
res6: org.apache.hadoop.hbase.client.Put = {"totalColumns":4,"row":"1","families":{"emp personal data ":[{"qualifier":"Name","vlen":6,"tag":[],"timestamp":9223372036854775807},{"qualifier":"City","vlen":8,"tag":[],"timestamp":9223372036854775807},{"qualifier":"Company","vlen":5,"tag":[],"timestamp":9223372036854775807},{"qualifier":"location","vlen":8,"tag":[],"timestamp":9223372036854775807}]}}
but I don't see any new column in HBase.
Can anyone help, please? Any reference to configuration would be great. Do I need to configure ZooKeeper separately? I appreciate your help.
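(One step the transcript never shows is submitting the Put: with the HBase client API, a Put is only built locally until Table#put is called. A minimal sketch of that missing step, using the names from the session above:)
scala> tableName.put(insertData)
scala> tableName.close()
scala> connection.close()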
Related
Following are all the Oozie files I have been using to run the job. I created the folder /test/jar on HDFS and put the workflow.xml and coordinator.xml files there.
Properties File
nameNode=hdfs://host:8020
jobTracker=host:8050
queueName=default
oozie.use.system.lib.path=true
oozie.coord.application.path=${nameNode}/test/jar/coordinator.xml
oozie.action.sharelib.for.spark=spark2
start=2019-05-22T07:37Z
end=2019-05-22T07:40Z
freq=*/1 * * * *
zone=UTC
user.name=oozie
oozie.action.sharelib.for.spark.exclusion=oozie/jackson
#oozie.libpath=${nameNode}/user/oozie/share/lib
Coordinator File
<coordinator-app xmlns = "uri:oozie:coordinator:0.5" name = "test" frequency = "${freq}" start = "${start}" end = "${end}" timezone = "${zone}">
<controls>
<timeout>1</timeout>
</controls>
<action>
<workflow>
<app-path>${nameNode}/test/jar/workflow.xml</app-path>
</workflow>
</action>
</coordinator-app>
Workflow file
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.5">
<start to="test" />
<action name="test">
<spark xmlns="uri:oozie:spark-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<master>yarn</master>
<mode>cluster</mode>
<name>Spark Example</name>
<class>com.spark.excel.mysql.executor.Executor</class>
<jar>${nameNode}/test/jar/com.spark.excel.mysql-0.1.jar</jar>
<spark-opts>--executor-memory 2G --num-executors 2</spark-opts>
</spark>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Workflow failed, error message [${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end" />
</workflow-app>
I have set up the sharelib path as well. Oozie shows spark2 through the sharelib list, and I have also added the oozie-sharelib-spark.jar file to spark2. The Oozie job submits and runs, but when it tries to execute the Spark job it throws an error.
I had the same error. In my case I had to add the following to the properties file:
oozie.use.system.libpath=true
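(Note the spelling: the question's properties file has oozie.use.system.lib.path, which Oozie does not recognize; the extra dot means the setting is never applied.)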
I am working with Apache Spark and Apache Ignite. I have a Spark dataset that I wrote into Ignite using the following code:
dataset.write()
.mode(SaveMode.Overwrite)
.format(FORMAT_IGNITE())
.option(OPTION_CONFIG_FILE(), "ignite-server-config.xml")
.option(OPTION_TABLE(), "CUSTOM_VALUES")
.option(OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS(), "ID")
.save();
And I am reading it again to perform a group-by operation, which will be pushed down to Ignite:
Dataset<Row> igniteDataset = sparkSession.read()
.format(FORMAT_IGNITE())
.option(OPTION_CONFIG_FILE(), "ignite-server-config.xml")
.option(OPTION_TABLE(), "CUSTOM_VALUES")
.load();
RelationalGroupedDataset idGroupedData = igniteDataset.groupBy(customized_id);
Dataset<Row> result = idGroupedData.agg(count(id).as("count_id"),
count(fid).as("count_custom_field_id"),
count(type).as("count_customized_type"),
count(val).as("count_value"), count(customized_id).as("groupCount"));
Now I want to get the number of rows returned by the group-by, so I am calling count() on the dataset as result.count();
When I do this, I get the following exception (note that the generated subquery, SELECT FROM CUSTOM_VALUES, has no select list, which is what H2 rejects):
Caused by: org.h2.jdbc.JdbcSQLException: Syntax error in SQL statement "SELECT COUNT(1) AS COUNT FROM (SELECT FROM CUSTOM_VALUES GROUP[*] BY CUSTOMIZED_ID) TABLE1 "; expected "., (, USE, AS, RIGHT, LEFT, FULL, INNER, JOIN, CROSS, NATURAL, ,, SELECT"; SQL statement:
SELECT COUNT(1) AS count FROM (SELECT FROM CUSTOM_VALUES GROUP BY CUSTOMIZED_ID) table1 [42001-197]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:357)
at org.h2.message.DbException.getSyntaxError(DbException.java:217)
Other functions such as show() and collectAsList().size() work.
What am I missing here?
I tested your example against the latest community version, 8.7.5, of GridGain, which is the open-source edition of GridGain based on the Ignite 2.7.0 sources plus a subset of additional fixes (https://www.gridgain.com/resources/download).
Here is the code:
import static org.apache.spark.sql.functions.count;
import org.apache.ignite.spark.IgniteDataFrameSettings;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.RelationalGroupedDataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
public class Main {
public static void main(String[] args) {
if (args.length < 1)
throw new IllegalArgumentException("You should set the path to client configuration file.");
String configPath = args[0];
SparkSession session = SparkSession.builder()
.enableHiveSupport()
.getOrCreate();
Dataset<Row> igniteDataset = session.read()
.format(IgniteDataFrameSettings.FORMAT_IGNITE()) //Data source
.option(IgniteDataFrameSettings.OPTION_TABLE(), "Person") //Table to read.
.option(IgniteDataFrameSettings.OPTION_CONFIG_FILE(), configPath) //Ignite config.
.load();
RelationalGroupedDataset idGroupedData = igniteDataset.groupBy("CITY_ID");
Dataset<Row> result = idGroupedData.agg(count("id").as("count_id"),
count("city_id").as("count_city_id"),
count("name").as("count_name"),
count("age").as("count_age"),
count("company").as("count_company"));
result.show();
session.close();
}
}
Here are the maven dependencies:
<dependencies>
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>gridgain-core</artifactId>
<version>8.7.5</version>
</dependency>
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>ignite-core</artifactId>
<version>8.7.5</version>
</dependency>
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>ignite-spring</artifactId>
<version>8.7.5</version>
</dependency>
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>ignite-indexing</artifactId>
<version>8.7.5</version>
</dependency>
<dependency>
<groupId>org.gridgain</groupId>
<artifactId>ignite-spark</artifactId>
<version>8.7.5</version>
</dependency>
</dependencies>
Here is the cache configuration:
<property name="cacheConfiguration">
<list>
<bean class="org.apache.ignite.configuration.CacheConfiguration">
<property name="name" value="Person"/>
<property name="cacheMode" value="PARTITIONED"/>
<property name="atomicityMode" value="ATOMIC"/>
<property name="sqlSchema" value="PUBLIC"/>
<property name="queryEntities">
<list>
<bean class="org.apache.ignite.cache.QueryEntity">
<property name="keyType" value="PersonKey"/>
<property name="valueType" value="PersonValue"/>
<property name="tableName" value="Person"/>
<property name="keyFields">
<list>
<value>id</value>
<value>city_id</value>
</list>
</property>
<property name="fields">
<map>
<entry key="id" value="java.lang.Integer"/>
<entry key="city_id" value="java.lang.Integer"/>
<entry key="name" value="java.lang.String"/>
<entry key="age" value="java.lang.Integer"/>
<entry key="company" value="java.lang.String"/>
</map>
</property>
<property name="aliases">
<map>
<entry key="id" value="id"/>
<entry key="city_id" value="city_id"/>
<entry key="name" value="name"/>
<entry key="age" value="age"/>
<entry key="company" value="company"/>
</map>
</property>
</bean>
</list>
</property>
</bean>
</list>
</property>
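(For context, this fragment sits inside the IgniteConfiguration bean of the Spring XML file that OPTION_CONFIG_FILE points to, roughly like this:)
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
<property name="cacheConfiguration">
<!-- the list of CacheConfiguration beans shown above -->
</property>
</bean>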
Using Spark 2.3.0, which is the only version supported by the ignite-spark dependency, I got the following result on my test data:
Data:
ID,CITY_ID,NAME,AGE,COMPANY,
4,1,Justin Bronte,23,bank,
3,1,Helen Richard,49,bank,
Result:
+-------+--------+-------------+----------+---------+-------------+
|CITY_ID|count_id|count_city_id|count_name|count_age|count_company|
+-------+--------+-------------+----------+---------+-------------+
| 1| 2| 2| 2| 2| 2|
+-------+--------+-------------+----------+---------+-------------+
Also, this code could be fully applied to Ignite 2.7.0.
I have some Java code that performs introspection on the schema of Cassandra tables. After upgrading the Cassandra driver dependency, this code no longer works as expected. With the old driver version, the type for a timestamp column was returned from ColumnMetadata#getType() as DataType.Name#TIMESTAMP. With the new driver, the same call returns DataType.Name#CUSTOM, with CustomType#getCustomTypeClassName returning org.apache.cassandra.db.marshal.DateType.
The old driver version is com.datastax.cassandra:cassandra-driver-core:2.1.9:
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.9</version>
</dependency>
The new driver version is com.datastax.cassandra:dse-driver:1.1.2:
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>dse-driver</artifactId>
<version>1.1.2</version>
</dependency>
The cluster version is DataStax Enterprise 2.1.11.969:
cqlsh> SELECT release_version FROM system.local;
release_version
-----------------
2.1.11.969
To illustrate the problem, I created a simple console application that prints column metadata for a specified table. (See below.) When built with the old driver, the output looks like this:
# old driver
mvn -Pcassandra-driver clean package
java -jar target/cassandra-print-column-metadata-cassandra-driver.jar <address> <user> <password> <keyspace> <table>
...
ts timestamp
...
When built with the new driver, the output looks like this:
# new driver
mvn -Pdse-driver clean package
java -jar target/cassandra-print-column-metadata-dse-driver.jar <address> <user> <password> <keyspace> <table>
...
ts 'org.apache.cassandra.db.marshal.DateType'
...
So far, I have only encountered this problem with timestamp columns. I have not seen it for any other data types, though my schema does not exhaustively use all of the supported data types.
DESCRIBE TABLE shows that the column is timestamp. system.schema_columns shows that the validator is org.apache.cassandra.db.marshal.DateType.
[cqlsh 3.1.7 | Cassandra 2.1.11.969 | CQL spec 3.0.0 | Thrift protocol 19.39.0]
cqlsh:my_keyspace> DESCRIBE TABLE my_table;
CREATE TABLE my_table (
prim_addr text,
ch text,
received_on timestamp,
...
PRIMARY KEY (prim_addr, ch, received_on)
) WITH
bloom_filter_fp_chance=0.100000 AND
caching='{"keys":"ALL", "rows_per_partition":"NONE"}' AND
comment='emm_ks' AND
dclocal_read_repair_chance=0.000000 AND
gc_grace_seconds=864000 AND
read_repair_chance=0.100000 AND
compaction={'sstable_size_in_mb': '160', 'class': 'LeveledCompactionStrategy'} AND
compression={'sstable_compression': 'SnappyCompressor'};
cqlsh:system> SELECT * FROM system.schema_columns WHERE keyspace_name = 'my_keyspace' AND columnfamily_name = 'my_table' AND column_name IN ('prim_addr', 'ch', 'received_on');
keyspace_name | columnfamily_name | column_name | component_index | index_name | index_options | index_type | type | validator
---------------+-------------------+-------------+-----------------+------------+---------------+------------+----------------+------------------------------------------
my_keyspace | my_table | ch | 0 | null | null | null | clustering_key | org.apache.cassandra.db.marshal.UTF8Type
my_keyspace | my_table | prim_addr | null | null | null | null | partition_key | org.apache.cassandra.db.marshal.UTF8Type
my_keyspace | my_table | received_on | 1 | null | null | null | clustering_key | org.apache.cassandra.db.marshal.DateType
Is this a bug in the driver, an intentional change in behavior, or some kind of misconfiguration on my part?
pom.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cnauroth</groupId>
<artifactId>cassandra-print-column-metadata</artifactId>
<version>0.0.1-SNAPSHOT</version>
<description>Console application that prints Cassandra table column metadata</description>
<name>cassandra-print-column-metadata</name>
<packaging>jar</packaging>
<properties>
<maven.compiler.source>1.7</maven.compiler.source>
<maven.compiler.target>1.7</maven.compiler.target>
<slf4j.version>1.7.25</slf4j.version>
</properties>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<addDefaultImplementationEntries>true</addDefaultImplementationEntries>
<mainClass>cnauroth.Main</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<finalName>${project.artifactId}</finalName>
<appendAssemblyId>false</appendAssemblyId>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>dse-driver</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
<dependencies>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>dse-driver</artifactId>
<version>1.1.2</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>${project.artifactId}-dse-driver</finalName>
</configuration>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>cassandra-driver</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<dependencies>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.9</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<finalName>${project.artifactId}-cassandra-driver</finalName>
</configuration>
</plugin>
</plugins>
</build>
</profile>
</profiles>
<dependencies>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>${slf4j.version}</version>
</dependency>
</dependencies>
</project>
Main.java
package cnauroth;
import java.util.List;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ColumnMetadata;
import com.datastax.driver.core.Session;
class Main {
public static void main(String[] args) throws Exception {
// Skipping validation for brevity
String address = args[0];
String user = args[1];
String password = args[2];
String keyspace = args[3];
String table = args[4];
try (Cluster cluster = new Cluster.Builder()
.addContactPoints(address)
.withCredentials(user, password)
.build()) {
List<ColumnMetadata> columns =
cluster.getMetadata().getKeyspace(keyspace).getTable(table).getColumns();
for (ColumnMetadata column : columns) {
System.out.println(column);
}
}
}
}
It looks like the internal Cassandra type used for timestamp changed from org.apache.cassandra.db.marshal.DateType to org.apache.cassandra.db.marshal.TimestampType between Cassandra 1.2 and 2.0 (CASSANDRA-5723). If you created the table with Cassandra 1.2 (or a DSE version based on it), DateType would be used (even if you upgraded your cluster later).
It appears that the 2.1 version of the java driver was able to account for this (source) but starting with 3.0 it does not (source). Instead, it parses it as a Custom type.
Fortunately, the driver is still able to serialize and deserialize this column, since the CQL timestamp type is communicated over the protocol in responses, but it is a bug that the driver parses it as the wrong type. I went ahead and created JAVA-1561 to track this.
If you were to migrate your cluster to C* 3.0+ or DSE 5.0+ I suspect the problem goes away as the schema tables reference the cql name instead of the representative Java class name (unless it is indeed a custom type).
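Until JAVA-1561 is fixed, if your introspection code needs to keep treating these columns as timestamps, a workaround along these lines should be possible (a sketch using the driver 3.x CustomType API mentioned in the question; cqlTypeName is just an illustrative helper name):
import com.datastax.driver.core.DataType;
class TypeNames {
// Driver 3.x reports pre-2.0 timestamp columns as a custom type whose
// class name is org.apache.cassandra.db.marshal.DateType (CASSANDRA-5723).
static String cqlTypeName(DataType type) {
if (type instanceof DataType.CustomType) {
String marshalClass = ((DataType.CustomType) type).getCustomTypeClassName();
if ("org.apache.cassandra.db.marshal.DateType".equals(marshalClass)) {
return "timestamp";
}
}
return type.getName().toString();
}
}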
Spark Command:
spark-submit \
--class com.dev.SparkHiveToHdfs \
--jars /home/dev/dbJars/datanucleus-api-jdo-3.2.6.jar,/home/dev/dbJars/datanucleus-rdbms-3.2.9.jar,/home/dev/dbJars/datanucleus-core-3.2.10.jar \
--master yarn-cluster \
--name DCA_SPARK_JOB \
/home/dev/dbJars/data-connector-spark.jar dev.emp
data-connector-spark.jar contains the code below:
public class SparkHiveToHdfs {
public static void main(String[] args) throws Exception {
String hiveTableNameWithSchema = args[0];
SparkConf conf = new SparkConf(true).setMaster("yarn-cluster").setAppName("DCA_HIVE_HDFS");
SparkContext sc = new SparkContext(conf);
HiveContext hc = new HiveContext(sc);
DataFrame df = hc.sql("select * from "+hiveTableNameWithSchema);
df.printSchema();
}
}
Properties in hive-site.xml in $SPARK_HOME/conf:
<property>
<name>hive.metastore.client.connect.retry.delay</name>
<value>5</value>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>1800</value>
</property>
<property>
<name>hive.metastore.connect.retries</name>
<value>24</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://xxxx:9083</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
</property>
Error log:
ERROR ApplicationMaster: User class threw exception: org.apache.spark.sql.AnalysisException: Table not found: `dev`.`emp`; line 1 pos 18
org.apache.spark.sql.AnalysisException: Table not found: `dev`.`emp`; line 1 pos 18
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:54)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:121)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:120)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
at com.impetus.idw.data.connector.SparkHiveToHdfs.main(SparkHiveToHdfs.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:559)
Please try passing hive-site.xml in the spark-submit command, so that the driver running inside the YARN cluster can locate the Hive metastore:
spark-submit \
--class com.dev.SparkHiveToHdfs \
--jars /home/dev/dbJars/datanucleus-api-jdo-3.2.6.jar,/home/dev/dbJars/datanucleus-rdbms-3.2.9.jar,/home/dev/dbJars/datanucleus-core-3.2.10.jar \
--master yarn-cluster \
--name DCA_SPARK_JOB \
--files hive-site.xml \
/home/dev/dbJars/data-connector-spark.jar dev.emp
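(Without hive-site.xml on its classpath, the driver in yarn-cluster mode falls back to a fresh local Derby metastore, which is why the dev.emp table appears not to exist.)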
My web app has normal skinning working for (E)CSS files.
I have 2 skins:
Normal
plainTxt=#006
plainBg=#FFF
readonlyTxt=#003082
menuTxt=#0000A0
blackwhite
plainTxt=#FFF
plainBg=#000
readonlyTxt=#D0D0D0
menuTxt=#000
The colors work.
But now I have the following issue: images.
For each skin I want different images.
How do I get that? I can't find it in any ECSS/skinning tutorial. Is it not possible?
Example css:
#header h1 {
background-position:0 6px;
background-image: url("#{resource['image/header_1.png']}");
}
Question: For the normal skin I would like to get the image 'image/header_1.png'.
For the blackwhite skin I would like 'image/header_2.png'.
How do I do this? Any pointers at all would be great.
Should I solve it in the resource servlet (or something)?
Or do I maybe need to fix it in the Maven plugin? Of that part I understand even less ;-)
Part of the pom.xml:
<plugin>
<groupId>org.richfaces.cdk</groupId>
<artifactId>maven-richfaces-resources-plugin</artifactId>
<version>4.3.7.Final</version>
<executions>
<execution>
<id>process-resources</id>
<goals>
<goal>process</goal>
</goals>
<configuration>
<staticResourceMappingFile>D:\workspace\MyProject\target\classes/META-INF/richfaces/custom-packedcompressed-resource-mappings.properties</staticResourceMappingFile>
<resourcesOutputDir>D:\workspace\MyProject\target\classes/META-INF/resources/org.richfaces.staticResource/4.3.7.Final/PackedCompressed/</resourcesOutputDir>
<staticResourcePrefix>org.richfaces.staticResource/4.3.7.Final/PackedCompressed/</staticResourcePrefix>
<pack>true</pack>
<compress>true</compress>
<excludedFiles>
<exclude>^javax.faces</exclude>
<exclude>^\Qorg.richfaces.renderkit.html.images.\E.*</exclude>
<exclude>^\Qorg.richfaces.renderkit.html.iconimages.\E.*</exclude>
<exclude>^jquery\.js$</exclude>
</excludedFiles>
<webRoot>D:\workspace\MyProject/src/main/webapp</webRoot>
</configuration>
</execution>
</executions>
<dependencies>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>1.7.5</version>
<scope>compile</scope>
</dependency>
</dependencies>
<configuration>
<skins>
<skin>normal</skin>
<skin>blackwhite</skin>
</skins>
<excludedFiles>
<exclude>^\Qorg.richfaces.renderkit.html.images.\E.*</exclude>
<exclude>^\Qorg.richfaces.renderkit.html.iconimages.\E.*</exclude>
<exclude>^jquery.js$</exclude>
</excludedFiles>
<includedContentTypes>
<include>application/javascript</include>
<include>text/css</include>
<include>image/.+</include>
</includedContentTypes>
<fileNameMappings>
<property>
<name>^org\.richfaces\.ckeditor/([^/]+\.(png|gif|jpg))$</name>
<value>org.richfaces.ckeditor.images/$1</value>
</property>
<property>
<name>^org\.richfaces\.ckeditor/([^/]+\.css)$</name>
<value>org.richfaces.ckeditor.css/$1</value>
</property>
<property>
<name>^org\.richfaces\.ckeditor/([^/]+\.(js))$</name>
<value>org.richfaces.ckeditor.js/$1</value>
</property>
<property>
<name>^.+/([^/]+\.(png|gif|jpg))$</name>
<value>org.richfaces.images/$1</value>
</property>
<property>
<name>^.+/([^/]+\.css)$</name>
<value>org.richfaces.css/$1</value>
</property>
</fileNameMappings>
</configuration>
</plugin>
The solution was not complicated but not satisfying.
I moved the reference of the resource file to the skin property file:
custom.ecss :
...
.btn-accept{
height:22px;
width:47px;
background: '#{richSkin.okBtnImg}';
}
...
default.skin.properties :
....
okBtnImg=url("#{resource['image/button/okBtnImg.png']}") no-repeat 0 0 !important;
...
custom.skin.properties :
....
okBtnImg=url("#{resource['image/customSkin/button/okBtnImg.png']}") no-repeat 0 0 !important;
...
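Applied to the header example from the question, the same approach would look roughly like this (headerImg is just an illustrative property name):
custom.ecss:
#header h1 {
background: '#{richSkin.headerImg}';
}
normal skin properties:
headerImg=url("#{resource['image/header_1.png']}") no-repeat 0 6px
blackwhite skin properties:
headerImg=url("#{resource['image/header_2.png']}") no-repeat 0 6px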