Problems using the EclairJS Server - node.js

I tried to use the EclairJS Server following the instructions available here: https://github.com/EclairJS/eclairjs/tree/master/server
After executing mvn package, I got the following error:
Tests run: 293, Failures: 8, Errors: 9, Skipped: 0
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:51 min
[INFO] Finished at: 2018-04-10T07:13:41+00:00
[INFO] Final Memory: 31M/373M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.19.1:test (default-test) on project eclairjs-nashorn: There are test failures.
[ERROR]
[ERROR] Please refer to /root/eclairjs/server/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 26.528 sec - in org.eclairjs.nashorn.ZClusterTest
Running org.eclairjs.nashorn.PairRDDTest
Tests run: 8, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: 2.821 sec <<< FAILURE! - in org.eclairjs.nashorn.PairRDDTest
countByKey(org.eclairjs.nashorn.PairRDDTest) Time elapsed: 0.582 sec <<< FAILURE!
org.junit.ComparisonFailure: failure - strings are not equal expected:<{"[pandas":1,"coffee":3]}> but was:<{"[coffee":3,"pandas":1]}>
at org.eclairjs.nashorn.PairRDDTest.countByKey(PairRDDTest.java:64)
cogroup2(org.eclairjs.nashorn.PairRDDTest) Time elapsed: 0.73 sec <<< FAILURE!
org.junit.ComparisonFailure: failure - strings are not equal expected:<[{"0":"[Apples","1":{"0":["Fruit"],"1":[3],"2":[42],"length":3},"length":2},{"0":"Oranges","1":{"0":["Fruit","Citrus"],"1":[2],"2":[21]],"length":3},"lengt...> but was:<[{"0":"[Oranges","1":{"0":["Fruit","Citrus"],"1":[2],"2":[21],"length":3},"length":2},{"0":"Apples","1":{"0":["Fruit"],"1":[3],"2":[42]],"length":3},"lengt...>
at org.eclairjs.nashorn.PairRDDTest.cogroup2(PairRDDTest.java:112)
cogroup3(org.eclairjs.nashorn.PairRDDTest) Time elapsed: 0.405 sec <<< FAILURE!
org.junit.ComparisonFailure: failure - strings are not equal expected:<[{"0":"[Apples","1":{"0":["Fruit"],"1":[3],"2":[42],"3":["WA"],"length":4},"length":2},{"0":"Oranges","1":{"0":["Fruit","Citrus"],"1":[2],"2":[21],"3":["FL]"],"length":4},"leng...> but was:<[{"0":"[Oranges","1":{"0":["Fruit","Citrus"],"1":[2],"2":[21],"3":["FL"],"length":4},"length":2},{"0":"Apples","1":{"0":["Fruit"],"1":[3],"2":[42],"3":["WA]"],"length":4},"leng...>
at org.eclairjs.nashorn.PairRDDTest.cogroup3(PairRDDTest.java:124)
Tests run: 50, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 94.35 sec <<< FAILURE! - in org.eclairjs.nashorn.MlTest
LDAExample(org.eclairjs.nashorn.MlTest) Time elapsed: 0.005 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from examples/ml/LDA_example.js in /ml/mltest.js at line number 214
at org.eclairjs.nashorn.MlTest.LDAExample(MlTest.java:610)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from examples/ml/LDA_example.js
at org.eclairjs.nashorn.MlTest.LDAExample(MlTest.java:610)
Running org.eclairjs.nashorn.CoreExamplesTest
Tests run: 6, Failures: 0, Errors: 6, Skipped: 0, Time elapsed: 0.064 sec <<< FAILURE! - in org.eclairjs.nashorn.CoreExamplesTest
WordCount(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.017 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.WordCount(CoreExamplesTest.java:48)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.WordCount(CoreExamplesTest.java:48)
SparkLR(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.006 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.SparkLR(CoreExamplesTest.java:88)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.SparkLR(CoreExamplesTest.java:88)
SparkPI(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.007 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.SparkPI(CoreExamplesTest.java:76)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.SparkPI(CoreExamplesTest.java:76)
SparkTC(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.006 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.SparkTC(CoreExamplesTest.java:64)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.SparkTC(CoreExamplesTest.java:64)
PageRank(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.008 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.PageRank(CoreExamplesTest.java:100)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.PageRank(CoreExamplesTest.java:100)
LogQuery(org.eclairjs.nashorn.CoreExamplesTest) Time elapsed: 0.007 sec <<< ERROR!
javax.script.ScriptException: TypeError: Cannot load script from eclairjs/sql/sparkSession in file:/root/eclairjs/server/target/classes/eclairjs/jvm-npm/jvm-npm.js at line number 122
at org.eclairjs.nashorn.CoreExamplesTest.LogQuery(CoreExamplesTest.java:115)
Caused by: jdk.nashorn.internal.runtime.ECMAException: TypeError: Cannot load script from eclairjs/sql/sparkSession
at org.eclairjs.nashorn.CoreExamplesTest.LogQuery(CoreExamplesTest.java:115)
Can anyone please help me get past this error, or share another way to use Apache Spark from my Node.js application?
Thank you.
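One thing that may be worth trying, if the goal is simply to produce the server artifact: the failures above are all reported by the Surefire test phase, so packaging with the tests skipped may be enough to move forward (this does not fix the underlying test failures, it only lets Maven finish the build so the Node.js client can be tried against the result):
mvn package -DskipTests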

Related

Multithreaded paging database query fails with java.sql.SQLException: GC overhead limit exceeded

1. First, I paginate the order_id values from this table.
2. After getting the order_id values, I query the data in this table for each order.
The reason for this check is to ensure that the billing of each order is complete. But I ran into a problem when I did this, and it is as follows:
2022-12-15 11:16:52.798 [,] [master housekeeper] WARN com.zaxxer.hikari.pool.HikariPool - master - Thread starvation or clock leap detected (housekeeper delta=1m344ms).
Exception in thread "RiskOverdueBusiness-1" org.springframework.dao.TransientDataAccessResourceException:
### Error querying database. Cause: java.sql.SQLException: GC overhead limit exceeded
### The error may exist in qnvip/data/overview/mapper/risk/RiskOverdueBaseMapper.java (best guess)
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### SQL: SELECT id,renew_term,deleted,count_day,order_id,order_no,mini_type,repay_date,real_repay_time,platform,finance_type,term,risk_level,risk_strategy,audit_type,forced_conversion,discount_return_amt,rent_total,buyout_amt,act_buyout_amt,buyout_discount,overdue_fine,bond_amt,before_discount,total_discount,bond_rate,is_overdue,real_capital,capital,repay_status,overdue,real_repay_time_status,renew_total_rent,max_term,hit_value,renew_status,renew_day,surplus_bond_amt,actual_supply_price,is_settle,order_status,current_overdue_days,surplus_amt,overdue_day,term_overdue_days,renew_type,is_deleted,version,create_time,update_time FROM dataview_risk_overdue_base WHERE is_deleted=0 AND (overdue_day = ?)
### Cause: java.sql.SQLException: GC overhead limit exceeded
; GC overhead limit exceeded; nested exception is java.sql.SQLException: GC overhead limit exceeded
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:110)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.mybatis.spring.MyBatisExceptionTranslator.translateExceptionIfPossible(MyBatisExceptionTranslator.java:88)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:440)
at com.sun.proxy.$Proxy142.selectList(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:223)
at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.executeForMany(MybatisMapperMethod.java:173)
at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:78)
at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148)
at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89)
at com.sun.proxy.$Proxy212.selectList(Unknown Source)
at com.baomidou.mybatisplus.extension.service.IService.list(IService.java:279)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl.getListByOrderId(RiskOverdueBaseServiceImpl.java:61)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl$$FastClassBySpringCGLIB$$31ab1dda.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl$$EnhancerBySpringCGLIB$$e7698a08.getListByOrderId(<generated>)
at qnvip.data.overview.business.risk.RiskOverdueBusinessNew.oldRepayTask(RiskOverdueBusinessNew.java:95)
at qnvip.data.overview.business.risk.RiskOverdueBusinessNew.lambda$execOldData$1(RiskOverdueBusinessNew.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: GC overhead limit exceeded
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:370)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:64)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63)
at com.sun.proxy.$Proxy481.query(Unknown Source)
at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doQuery(MybatisSimpleExecutor.java:69)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:325)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at com.baomidou.mybatisplus.core.executor.MybatisCachingExecutor.query(MybatisCachingExecutor.java:165)
at com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor.intercept(MybatisPlusInterceptor.java:81)
at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61)
at com.sun.proxy.$Proxy480.query(Unknown Source)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426)
... 18 more
2022-12-15 11:17:12.040 [,] [slave housekeeper] WARN com.zaxxer.hikari.pool.HikariPool - slave - Thread starvation or clock leap detected (housekeeper delta=59s747ms).
Exception in thread "RiskOverdueBusiness-5" org.springframework.dao.TransientDataAccessResourceException:
### Error querying database. Cause: java.sql.SQLException: Can not read response from server. Expected to read 321 bytes, read 167 bytes before connection was unexpectedly lost.
### The error may exist in qnvip/data/overview/mapper/risk/RiskOverdueBaseMapper.java (best guess)
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### Cause: java.sql.SQLException: Can not read response from server. Expected to read 321 bytes, read 167 bytes before connection was unexpectedly lost.
; Can not read response from server. Expected to read 321 bytes, read 167 bytes before connection was unexpectedly lost.; nested exception is java.sql.SQLException: Can not read response from server. Expected to read 321 bytes, read 167 bytes before connection was unexpectedly lost.
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:110)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:81)
at com.sun.proxy.$Proxy142.selectList(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:223)
at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.executeForMany(MybatisMapperMethod.java:173)
at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:78)
at com.baomidou.mybatisplus.core.override.MybatisMapperProxy$PlainMethodInvoker.invoke(MybatisMapperProxy.java:148)
at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:89)
at com.sun.proxy.$Proxy212.selectList(Unknown Source)
at com.baomidou.mybatisplus.extension.service.IService.list(IService.java:279)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl.getListByOrderId(RiskOverdueBaseServiceImpl.java:61)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl$$FastClassBySpringCGLIB$$31ab1dda.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
at qnvip.data.overview.service.risk.impl.RiskOverdueBaseServiceImpl$$EnhancerBySpringCGLIB$$e7698a08.getListByOrderId(<generated>)
at qnvip.data.overview.business.risk.RiskOverdueBusinessNew.oldRepayTask(RiskOverdueBusinessNew.java:95)
at qnvip.data.overview.business.risk.RiskOverdueBusinessNew.lambda$execOldData$1(RiskOverdueBusinessNew.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLException: Can not read response from server. Expected to read 321 bytes, read 167 bytes before connection was unexpectedly lost.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:370)
at com.zaxxer.hikari.pool.ProxyPreparedStatement.execute(ProxyPreparedStatement.java:44)
at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.execute(HikariProxyPreparedStatement.java)
at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:64)
at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79)
at sun.reflect.GeneratedMethodAccessor209.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63)
at com.sun.proxy.$Proxy481.query(Unknown Source)
at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doQuery(MybatisSimpleExecutor.java:69)
at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:325)
at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156)
at com.baomidou.mybatisplus.core.executor.MybatisCachingExecutor.query(MybatisCachingExecutor.java:165)
at com.baomidou.mybatisplus.extension.plugins.MybatisPlusInterceptor.intercept(MybatisPlusInterceptor.java:81)
at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61)
at com.sun.proxy.$Proxy480.query(Unknown Source)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147)
at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140)
at sun.reflect.GeneratedMethodAccessor210.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426)
... 18 more
I suspected the query result was too large for the references to be released in time, but what I tried did not seem to help. How do I deal with these exceptions and make sure that all of the data gets queried?
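One way to keep memory bounded is to query in small, fixed-size chunks of order_id values, so that no single result set is large enough to exhaust the heap (raising the worker JVM's -Xmx is the other common mitigation). A rough sketch under my own assumptions; the entity, service and process() names are hypothetical stand-ins, not the asker's actual code:
import java.util.List;
import com.baomidou.mybatisplus.core.conditions.query.QueryWrapper;

// Hypothetical sketch: fetch the rows for at most `chunkSize` orders per query,
// handle them, then let the chunk become eligible for GC before the next one.
void queryInChunks(List<Long> orderIds) {
    final int chunkSize = 500; // assumed batch size, tune as needed
    for (int from = 0; from < orderIds.size(); from += chunkSize) {
        List<Long> chunk = orderIds.subList(from, Math.min(from + chunkSize, orderIds.size()));
        List<RiskOverdueBase> rows = riskOverdueBaseService.list(   // MyBatis-Plus IService.list(Wrapper)
                new QueryWrapper<RiskOverdueBase>()
                        .eq("is_deleted", 0)
                        .in("order_id", chunk));
        process(rows);
    }
}
The important part is that each query is bounded by chunkSize, so the heap needed at any moment stays roughly constant no matter how many orders there are.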

IllegalArgumentException: Unknown message type: 9 while reading "delta" file

I am using <spark.version>3.1.2</spark.version> with Delta Lake io.delta:delta-core_2.12:1.0.0 in my project.
While reading a Delta file I get the IllegalArgumentException: Unknown message type: 9 error below:
java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 ($anonfun$apply$2 at DatabricksLogging.scala:77) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: java.lang.IllegalArgumentException: Unknown message type: 9
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71)
at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:80)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410)
at org.apache.spark.sql.delta.DeltaLog$.apply(DeltaLog.scala:464)
at org.apache.spark.sql.delta.DeltaLog$.forTable(DeltaLog.scala:401)
at org.apache.spark.sql.delta.catalog.DeltaTableV2.deltaLog$lzycompute(DeltaTableV2.scala:73)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:177)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$1(DataFrameReader.scala:305)
at scala.Option.map(Option.scala:230)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:265)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 ($anonfun$apply$2 at DatabricksLogging.scala:77) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.FetchFailedException: java.lang.IllegalArgumentException: Unknown message type: 9
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown message type: 9
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2258)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:868)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2196)
I am submitting the Spark job as follows:
export SPARK_HOME=/spark-3.1.2-bin-hadoop3.2
$SPARK_HOME/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--packages org.apache.hadoop:hadoop-aws:2.9.2,io.delta:delta-core_2.12:1.0.0,org.apache.hudi:hudi-spark-bundle_2.12:0.6.0
What is wrong here? Any clue? Any help is highly appreciated.
I encountered something similar, close to https://issues.apache.org/jira/browse/SPARK-33093:
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Unknown message type: 9
at org.apache.spark.network.shuffle.protocol.BlockTransferMessage$Decoder.fromByteBuffer(BlockTransferMessage.java:71)
at org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.receive(ExternalShuffleBlockHandler.java:81)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:157)
The following setting helped me as well:
spark.shuffle.useOldFetchProtocol=true
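Assuming the job is launched with spark-submit as shown above, one way to pass that setting is via --conf (a sketch only, with the rest of the command unchanged):
$SPARK_HOME/bin/spark-submit \
--master yarn \
--deploy-mode cluster \
--conf spark.shuffle.useOldFetchProtocol=true \
--packages org.apache.hadoop:hadoop-aws:2.9.2,io.delta:delta-core_2.12:1.0.0,org.apache.hudi:hudi-spark-bundle_2.12:0.6.0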

How to fix this error running Nutch 1.15: ERROR fetcher.Fetcher - Fetcher job did not succeed, job status:FAILED, reason: NA

When I start a crawl using Nutch 1.15 with this:
/usr/local/nutch/bin/crawl --i -s urls/seed.txt crawldb 5
it starts to run, and then I get this error when it tries to fetch:
2019-02-10 15:29:32,021 INFO mapreduce.Job - Running job: job_local1267180618_0001
2019-02-10 15:29:32,145 INFO fetcher.FetchItemQueues - Using queue mode : byHost
2019-02-10 15:29:32,145 INFO fetcher.Fetcher - Fetcher: threads: 50
2019-02-10 15:29:32,145 INFO fetcher.Fetcher - Fetcher: time-out divisor: 2
2019-02-10 15:29:32,149 INFO fetcher.QueueFeeder - QueueFeeder finished: total 1 records hit by time limit : 0
2019-02-10 15:29:32,234 WARN mapred.LocalJobRunner - job_local1267180618_0001
java.lang.Exception: java.lang.NullPointerException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NullPointerException
at org.apache.nutch.net.URLExemptionFilters.<init>(URLExemptionFilters.java:39)
at org.apache.nutch.fetcher.FetcherThread.<init>(FetcherThread.java:154)
at org.apache.nutch.fetcher.Fetcher$FetcherRun.run(Fetcher.java:222)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2019-02-10 15:29:33,023 INFO mapreduce.Job - Job job_local1267180618_0001 running in uber mode : false
2019-02-10 15:29:33,025 INFO mapreduce.Job - map 0% reduce 0%
2019-02-10 15:29:33,028 INFO mapreduce.Job - Job job_local1267180618_0001 failed with state FAILED due to: NA
2019-02-10 15:29:33,038 INFO mapreduce.Job - Counters: 0
2019-02-10 15:29:33,039 ERROR fetcher.Fetcher - Fetcher job did not succeed, job status:FAILED, reason: NA
2019-02-10 15:29:33,039 ERROR fetcher.Fetcher - Fetcher: java.lang.RuntimeException: Fetcher job did not succeed, job status:FAILED, reason: NA
at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:503)
at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:543)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:517)
And I get this error in the console, which shows the command it runs:
Error running:
/usr/local/nutch/bin/nutch fetch -D mapreduce.job.reduces=2 -D mapred.child.java.opts=-Xmx1000m -D mapreduce.reduce.speculative=false -D mapreduce.map.speculative=false -D mapreduce.map.output.compress=true -D fetcher.timelimit.mins=180 crawlsites/segments/20190210152929 -noParsing -threads 50
I had to delete the Nutch folder and do a fresh install; it worked after that.

Get an "IOException: Broken pipe" during submiting a spark job which is connecting hbase by pyspark code

I submit a Spark job that does some simple work with PySpark's newAPIHadoopRDD, which connects to HBase while the job runs. Our CDH cluster has Kerberos enabled, but I think I have passed authentication.
I will show my code, shell command, exception, and some CM config.
"19/01/16 10:55:42 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x36850456cea05e5
19/01/16 10:55:42 INFO zookeeper.ZooKeeper: Session: 0x36850456cea05e5 closed
Traceback (most recent call last):
File "/home/xxx/xxx/xxx_easy_hbase.py", line 36, in <module>
conf=hbaseconf)
File "/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 644, in newAPIHadoopRDD
File "/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
File "/opt/cloudera/parcels/CDH-5.13.3-1.cdh5.13.3.p0.2/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
19/01/16 10:55:42 INFO zookeeper.ClientCnxn: EventThread shut down
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=32, exceptions:
Wed Jan 16 10:55:42 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'event_opinion_type,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx.com,60020,1547543835462, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:320)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:247)
at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:62)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:210)
at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:327)
at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:302)
at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:167)
at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:162)
...
Caused by: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68449: row 'event_opinion_type,,00000000000000' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=xxx.com,60020,1547543835462, seqNum=0
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:169)
at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:80)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 more
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
... 4 more"

Spark-Kafka: org.apache.kafka.common.errors.TimeoutException while writing stream from Spark

I am facing an issue while writing a stream to a Kafka topic from Spark.
import org.apache.spark.sql.types._

val mySchema = StructType(Array(
  StructField("ID", IntegerType),
  StructField("ACCOUNT_NUMBER", StringType)
))

val streamingDataFrame = spark.readStream.schema(mySchema).option("delimiter", ",")
  .csv("file:///opt/files")

streamingDataFrame.selectExpr("CAST(id AS STRING) AS key", "to_json(struct(*)) AS value")
  .writeStream.format("kafka")
  .option("topic", "testing")
  .option("kafka.bootstrap.servers", "10.55.55.55:9092")
  .option("checkpointLocation", "file:///opt/")
  .start().awaitTermination()
Error:
2018-09-12 11:09:04,344 ERROR executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.kafka.common.errors.TimeoutException: Expiring 38 record(s) for testing-0: 30016 ms has passed since batch creation plus linger time
2018-09-12 11:09:04,358 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.kafka.common.errors.TimeoutException: Expiring 38 record(s) for testing-0: 30016 ms has passed since batch creation plus linger time
2018-09-12 11:09:04,359 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
2018-09-12 11:09:04,370 ERROR streaming.StreamExecution: Query [id = 866e4416-138a-42b6-82fd-04b6ee1aa638, runId = 4dd10740-29dd-4275-97e2-a43104d71cf5] terminated with error
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.kafka.common.errors.TimeoutException: Expiring 38 record(s) for testing-0: 30016 ms has passed since batch creation plus linger time
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
My sbt details:
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.2.0"
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.10.0.0"
But when I send messages directly with bin/kafka-console-producer.sh and bin/kafka-console-consumer.sh, I can send and receive messages fine.
You need to increase the value of request.timeout.ms on the client side.
Kafka groups records into batches to increase throughput. Once a record is added to a batch, it must be sent within a time limit; request.timeout.ms is the configurable parameter (default 30 seconds) that controls this limit.
When a batch stays queued for longer than that, a TimeoutException is thrown and the records are removed from the queue, so the messages are not delivered.
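With the Structured Streaming Kafka sink used above, producer properties can be passed by prefixing them with kafka. in the writer options. A sketch based on the code from the question (the 120000 ms value is just an example to tune for your environment):
streamingDataFrame.selectExpr("CAST(id AS STRING) AS key", "to_json(struct(*)) AS value")
  .writeStream.format("kafka")
  .option("topic", "testing")
  .option("kafka.bootstrap.servers", "10.55.55.55:9092")
  .option("kafka.request.timeout.ms", "120000") // raises the producer's request.timeout.ms above the 30s default
  .option("checkpointLocation", "file:///opt/")
  .start().awaitTermination()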
