Cannot do anything with a corrupt Lake Database table - Azure

I have a corrupt Lake Database table named api_endpoints in Azure Synapse. It was an external table, and I have already deleted the underlying data from the data lake. I cannot delete the table no matter what I try. I presume a snippet of metadata still resides in the Hive Metastore. When I query the database:
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES ORDER BY TABLE_NAME
the corrupt table name is still there. I've tried to get rid of this table in the following ways:
Simply dropping the table: spark.sql("DROP TABLE IF EXISTS api_endpoints")
Creating another database, and trying to move the corrupt table there:
CREATE DATABASE Junk;
USE raw_UtilityDB;
ALTER TABLE api_endpoints RENAME TO Junk.MyMovedCorruptTable;
DROP DATABASE Junk CASCADE;
Trying to overwrite the table using PySpark:
df = spark.read.csv('random_csv_file.csv')
df.write.saveAsTable("raw_UtilityDB.api_endpoints", mode='overwrite')
Every operation I try results in the exact same error:
Error: Cannot recognize hive type string: varchar, column: endpoint_name, db: raw_utilitydb, table: api_endpoints
org.apache.spark.sql.errors.QueryExecutionErrors$.convertHiveTableToCatalogTableError(QueryExecutionErrors.scala:1285)
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree2$1(HiveClientImpl.scala:446)
org.apache.spark.sql.hive.client.HiveClientImpl.convertHiveTableToCatalogTable(HiveClientImpl.scala:441)
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getTableOption$3(HiveClientImpl.scala:435)
scala.Option.map(Option.scala:230)
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getTableOption$1(HiveClientImpl.scala:435)
org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:306)
org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:236)
org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:235)
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:285)
org.apache.spark.sql.hive.client.HiveClientImpl.getTableOption(HiveClientImpl.scala:433)
org.apache.spark.sql.hive.client.HiveClient.getTable(HiveClient.scala:90)
org.apache.spark.sql.hive.client.HiveClient.getTable$(HiveClient.scala:89)
org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:91)
org.apache.spark.sql.hive.HiveExternalCatalog.getRawTable(HiveExternalCatalog.scala:123)
org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$getTable$1(HiveExternalCatalog.scala:722)
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
org.apache.spark.sql.hive.HiveExternalCatalog.getTable(HiveExternalCatalog.scala:722)
org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.getTable(ExternalCatalogWithListener.scala:138)
org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableRawMetadata(SessionCatalog.scala:574)
org.apache.spark.sql.catalyst.catalog.SessionCatalog.getTableMetadata(SessionCatalog.scala:559)
org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog.loadTable(V2SessionCatalog.scala:65)
org.apache.spark.sql.connector.catalog.DelegatingCatalogExtension.loadTable(DelegatingCatalogExtension.java:68)
org.apache.spark.sql.delta.catalog.DeltaCatalog.loadTable(DeltaCatalog.scala:173)
org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:281)
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableOrView(Analyzer.scala:1300)
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$15.applyOrElse(Analyzer.scala:1294)
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$15.applyOrElse(Analyzer.scala:1237)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1130)
org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1129)
org.apache.spark.sql.catalyst.plans.logical.DropTable.mapChildren(v2Commands.scala:532)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1237)
org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1203)
org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
scala.collection.immutable.List.foldLeft(List.scala:91)
org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
scala.collection.immutable.List.foreach(List.scala:431)
org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:222)
org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:218)
org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:167)
org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:218)
org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:182)
org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:93)
org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:203)
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:202)
org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:78)
org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:120)
org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:207)
org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:207)
org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:78)
org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:76)
org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:68)
org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)
org.apache.livy.repl.SQLInterpreter.execute(SQLInterpreter.scala:129)
org.apache.livy.repl.Session.$anonfun$executeCode$1(Session.scala:680)
scala.Option.map(Option.scala:230)
org.apache.livy.repl.Session.executeCode(Session.scala:677)
org.apache.livy.repl.Session.$anonfun$execute$4(Session.scala:483)
org.apache.livy.repl.Session.withRealtimeOutputSupport(Session.scala:866)
org.apache.livy.repl.Session.$anonfun$execute$1(Session.scala:483)
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
scala.util.Success.$anonfun$map$1(Try.scala:255)
scala.util.Success.map(Try.scala:213)
scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
How do I get rid of the pesky thing?

Related

How to figure out if a Metastore object is a view or a table?

I am trying to figure out whether there is a way to determine if a metastore object is a Table or a View.
For example, we can use this SQL query:
DESCRIBE DETAIL mydb.mytable
which returns metadata about the table. The format column here ("delta") tells us that this object is a table.
Is there an equivalent query that I can use to check if an object is a View?
Note: there seems to be no equivalent of the above DESCRIBE DETAIL query for a view. I tried the following:
DESCRIBE DETAIL mydb.myview
But I am getting this error:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Table or view not found: information_schema.tables; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [information_schema, tables], [], false
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$2(CheckAnalysis.scala:138)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$2$adapted(CheckAnalysis.scala:105)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:358)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:357)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:357)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
...
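As an aside, a hedged alternative worth trying: plain Spark SQL (as opposed to the Delta-specific DESCRIBE DETAIL) can report an object's type directly. Whether these commands are available depends on your Spark/Databricks runtime:
DESCRIBE TABLE EXTENDED mydb.myview;
-- the output includes a "Type" row: VIEW for views, MANAGED or EXTERNAL for tables
SHOW VIEWS IN mydb;
-- lists only views, so an object's presence here also distinguishes views from tables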

Could not create BLOOMFILTER Index in databricks

I am trying to create a BLOOMFILTER index by following the documentation:
https://docs.databricks.com/spark/2.x/spark-sql/language-manual/create-bloomfilter-index.html
I created the Delta table with:
spark.sql("DROP TABLE IF EXISTS testdb.fact_lists")
spark.sql("CREATE TABLE testdb.fact_lists USING DELTA LOCATION '/delta/fact-lists'")
I enabled the bloom filter with:
%sql
SET spark.databricks.io.skipping.bloomFilter.enabled = true;
SET delta.bloomFilter.enabled = true;
When I try to run the below CREATE statement for the BLOOMFILTER index, I get a "no viable alternative" parse error:
%sql
CREATE BLOOMFILTER INDEX
ON TABLE testdb.fact_lists
FOR COLUMNS(event_id OPTION(fpp=0.1, numItems=100))
Error:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.catalyst.parser.ParseException:
no viable alternative at input 'CREATE BLOOMFILTER'(line 1, pos 7)
== SQL ==
CREATE BLOOMFILTER INDEX
-------^^^
ON TABLE testdb.fact_lists
FOR COLUMNS(event_id OPTION(fpp=0.1, numItems=100))
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:298)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:159)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:88)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:106)
at com.databricks.sql.parser.DatabricksSqlParser.$anonfun$parsePlan$1(DatabricksSqlParser.scala:77)
at com.databricks.sql.parser.DatabricksSqlParser.parse(DatabricksSqlParser.scala:97)
at com.databricks.sql.parser.DatabricksSqlParser.parsePlan(DatabricksSqlParser.scala:74)
at org.apache.spark.sql.SparkSession.$anonfun$sql$2(SparkSession.scala:801)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:151)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:801)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:968)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:798)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:695)
at com.databricks.backend.daemon.driver.SQLDriverLocal.$anonfun$executeSql$1(SQLDriverLocal.scala:91)
at scala.collection.immutable.List.map(List.scala:293)
at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:37)
at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:145)
at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$11(DriverLocal.scala:605)
at com.databricks.logging.Log4jUsageLoggingShim$.$anonfun$withAttributionContext$1(Log4jUsageLoggingShim.scala:33)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:94)
at com.databricks.logging.Log4jUsageLoggingShim$.withAttributionContext(Log4jUsageLoggingShim.scala:31)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:205)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:204)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:60)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:240)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:225)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:60)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:582)
at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:615)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:607)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:526)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:561)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:431)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:374)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:225)
at java.lang.Thread.run(Thread.java:748)
at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:130)
at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:145)
at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$11(DriverLocal.scala:605)
at com.databricks.logging.Log4jUsageLoggingShim$.$anonfun$withAttributionContext$1(Log4jUsageLoggingShim.scala:33)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:94)
at com.databricks.logging.Log4jUsageLoggingShim$.withAttributionContext(Log4jUsageLoggingShim.scala:31)
at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:205)
at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:204)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:60)
at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:240)
at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:225)
at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:60)
at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:582)
at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:615)
at scala.util.Try$.apply(Try.scala:213)
at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:607)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:526)
at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:561)
at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:431)
at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:374)
at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:225)
at java.lang.Thread.run(Thread.java:748)
Kindly assist. Thanks in advance!
I got the same error when I used the same query to create a Bloomfilter index in Databricks 10.4 LTS on my sample table:
CREATE BLOOMFILTER INDEX
ON TABLE factlists
FOR COLUMNS(id OPTION(fpp=0.1, numItems=100))
#error message
ParseException:
no viable alternative at input 'CREATE bloomfilter'(line 1, pos 7)
== SQL ==
CREATE bloomfilter INDEX
-------^^^
ON TABLE factlists
FOR COLUMNS(id OPTION(fpp=0.1, numItems=100))
The error was caused by incorrect syntax. The following modified query, using OPTIONS instead of OPTION, created the Bloomfilter index successfully:
CREATE bloomfilter INDEX
ON TABLE factlists
FOR COLUMNS(id OPTIONS(fpp=0.1, numItems=100))
In your query, change OPTION to OPTIONS to resolve the error, as shown below.
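Applied to the table from the question, derived directly by swapping OPTION for OPTIONS, the corrected statement is:
CREATE BLOOMFILTER INDEX
ON TABLE testdb.fact_lists
FOR COLUMNS(event_id OPTIONS(fpp=0.1, numItems=100))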

java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.util.UUID exception with Cassandra?

I have a Spring Boot Java application that talks to Cassandra. However, one of my queries is failing.
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class ParameterisedListItemRepository {

    // findAll() below calls session.execute(), so the constructor stores the Session
    private final Session session;
    private final PreparedStatement findByIds;

    public ParameterisedListItemRepository(Session session, Validator validator, ParameterisedListMsisdnRepository parameterisedListMsisdnRepository) {
        this.session = session;
        this.findByIds = session.prepare("SELECT * FROM mep_parameterisedListItem WHERE id IN ( :ids )");
    }

    public List<ParameterisedListItem> findAll(List<UUID> ids) {
        List<ParameterisedListItem> parameterisedListItemList = new ArrayList<>();
        BoundStatement stmt = this.findByIds.bind();
        stmt.setList("ids", ids); // this is the call that throws the ClassCastException
        session.execute(stmt)
                .all()
                .stream()
                .map(parameterisedListItemMapper)
                .forEach(parameterisedListItemList::add);
        return parameterisedListItemList;
    }
}
The following is the stack trace:
java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.util.UUID
at com.datastax.driver.core.TypeCodec$AbstractUUIDCodec.serialize(TypeCodec.java:1626)
at com.datastax.driver.core.AbstractData.setList(AbstractData.java:358)
at com.datastax.driver.core.AbstractData.setList(AbstractData.java:374)
at com.datastax.driver.core.BoundStatement.setList(BoundStatement.java:681)
at com.openmind.primecast.repository.ParameterisedListItemRepository.findAll(ParameterisedListItemRepository.java:128)
at com.openmind.primecast.repository.ParameterisedListItemRepository$$FastClassBySpringCGLIB$$46ffc15e.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673)
at com.openmind.primecast.repository.ParameterisedListItemRepository$$EnhancerBySpringCGLIB$$b2db3c41.findAll(<generated>)
at com.openmind.primecast.service.impl.ParameterisedListItemServiceImpl.findByParameterisedList(ParameterisedListItemServiceImpl.java:102)
at com.openmind.primecast.web.rest.ParameterisedListItemResource.getParameterisedListItemsByParameterisedList(ParameterisedListItemResource.java:94)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Any idea what is going wrong? I know this query is the problem:
SELECT * FROM mep_parameterisedListItem WHERE id IN ( :ids )
Any idea how I can change the findAll function to make the query work?
This is the table definition:
CREATE TABLE "Openmind".mep_parameterisedlistitem (
id uuid PRIMARY KEY,
data text,
msisdn text,
ordernumber int,
parameterisedlist uuid
) WITH COMPACT STORAGE;
Thank you.
Without knowing the table schema, my guess is that a change was made to the table, so the schema no longer matches the bindings in the prepared statement.
A big part of the problem is your use of SELECT *. Our recommendation for best practice is to explicitly name all the columns you're retrieving from the table. By specifying the columns in your query, you avoid surprises when the table schema changes.
In this instance, either a new column was added or an old column was dropped. With the cached prepared statement, it was expecting one column type and got another: the ArrayList doesn't match UUID.
The solution is to re-prepare the statement and name all the columns. Cheers!
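As a hedged sketch of that advice, the re-prepared statement with every column named explicitly (column names taken from the table definition above) would look like:
-- explicit column list instead of SELECT *
SELECT id, data, msisdn, ordernumber, parameterisedlist
FROM mep_parameterisedListItem
WHERE id IN ( :ids );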

Presto CLI call system.create_empty_partition() ERROR

Environment
presto 0.215
presto-cli 0.215
presto-jdbc 0.215
Hive table created by Presto:
CREATE TABLE hive.origin.test_part (
id int,
date_key int
)
WITH (
format = 'ORC',
partitioned_by = ARRAY['date_key'],
external_location = '/user/hive/warehouse/origin.db/test_part/'
)
Inserts succeed from both Presto JDBC and the CLI; an example is sketched below.
The partition '20190122' did not exist beforehand, and the insert succeeded, which means renaming the tmp directory to /user/hive/warehouse/origin.db/test_part/date_key=20190122 worked.
/user/hive/warehouse/origin.db/test_part/date_key=20190122/ in HDFS
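For example, an insert along these lines succeeds from both clients (a hypothetical illustration; the actual statement is not shown in the question):
-- hypothetical example values
INSERT INTO hive.origin.test_part VALUES (1, 20190122);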
But calling system.create_empty_partition() from the Presto CLI failed:
CALL system.create_empty_partition( schema_name => 'origin', table_name => 'test_part', partition_columns => ARRAY['date_key'], partition_values => ARRAY['20190121'])
Full error message
com.facebook.presto.spi.PrestoException: Failed to rename hdfs://datacenter1:8020/tmp/presto-hive/b87162e5-9e48-4d43-a0e7-ecf0994fe625/date_key=20190121 to hdfs://datacenter1:8020/user/hive/warehouse/origin.db/test_part/date_key=20190121: rename returned false
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.renameDirectory(SemiTransactionalHiveMetastore.java:1787)
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.access$2700(SemiTransactionalHiveMetastore.java:87)
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.prepareAddPartition(SemiTransactionalHiveMetastore.java:1177)
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore$Committer.access$700(SemiTransactionalHiveMetastore.java:957)
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commitShared(SemiTransactionalHiveMetastore.java:885)
at com.facebook.presto.hive.metastore.SemiTransactionalHiveMetastore.commit(SemiTransactionalHiveMetastore.java:807)
at com.facebook.presto.hive.HiveMetadata.commit(HiveMetadata.java:1949)
at com.facebook.presto.hive.CreateEmptyPartitionProcedure.createEmptyPartition(CreateEmptyPartitionProcedure.java:126)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:627)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:649)
at com.facebook.presto.execution.CallTask.execute(CallTask.java:160)
at com.facebook.presto.execution.CallTask.execute(CallTask.java:60)
at com.facebook.presto.execution.DataDefinitionExecution.start(DataDefinitionExecution.java:168)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
/tmp/presto-hive/ in HDFS
So does CALL system.create_empty_partition() use a different user to manipulate HDFS?
This is failing due to a bug that prevents it from working with non-bucketed tables. It is fixed in the 301 release.

Cassandra count query failing due to AssertionError

I am trying out Cassandra for the first time, running it locally as a simple session-management DB. [Cassandra 2.0.4, CQL3, DataStax driver 2.0.0-rc2]
The following count query works fine when there is no data in the table:
select count(*) from session_data where app_name=? and account=? and last_access > ?
But after even a single row is inserted into the table, the query fails with the following error:
java.lang.AssertionError
at org.apache.cassandra.db.filter.ExtendedFilter$WithClauses.getExtraFilter(ExtendedFilter.java:258)
at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:1719)
at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1674)
at org.apache.cassandra.db.PagedRangeCommand.executeLocally(PagedRangeCommand.java:111)
at org.apache.cassandra.service.StorageProxy$LocalRangeSliceRunnable.runMayThrow(StorageProxy.java:1418)
at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1931)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Here is the schema I am using:
CREATE KEYSPACE session WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE session_data (
username text,
session_id text,
app_name text,
account text,
last_access timestamp,
created_on timestamp,
PRIMARY KEY (username, session_id, app_name, account)
);
CREATE INDEX sessionIndex ON session_data (session_id);
CREATE INDEX sessionAppName ON session_data (app_name);
CREATE INDEX lastAccessIndex ON session_data (last_access);
I am wondering if there is something wrong in the table definition/indexes or the query itself. Any help/insight would be greatly appreciated.
It looks like you're tripping over a bug in Cassandra. Here is the assertion and related comments in the Cassandra sources:
/*
* This method assumes the IndexExpression names are valid column names, which is not the
* case with composites. This is ok for now however since:
* 1) CompositeSearcher doesn't use it.
* 2) We don't yet allow non-indexed range slice with filters in CQL3 (i.e. this will never be
* called by CFS.filter() for composites).
*/
assert !(cfs.getComparator() instanceof CompositeType);
This code was modified between cassandra-2.0.4 and trunk as part of ticket CASSANDRA-5417, but it's not clear to me that the author was aware of this issue. The assertion was removed, but the comment was not. I would recommend submitting a bug report to the Cassandra project.
